
Questions tagged [objective-functions]

For questions related to the concept of loss (or cost) function in the context of machine learning.

0 votes
2 answers
79 views

I am using a machine learning model to remove interference from range-doppler maps to detect targets. I am using a supervised approach, in which I give as input the range-doppler map of target+...
ThinkPad's user avatar
3 votes
1 answer
107 views

With the popularity of AI across every media source this year, I'm interested in learning more about it and maybe one day building a good one. I have this code in Python: ...
Root Groves's user avatar
2 votes
1 answer
95 views

Are human content moderators for social media websites needed anymore with AI? In other words, is AI so good now that it can detect if an image is pornographic, obscene, or in any violating social ...
Geremia's user avatar
  • 577
0 votes
0 answers
85 views

I have a multi-label classification problem, where I was initially using a binary cross-entropy loss and my labels are one-hot encoded. I found a paper similar to my application and have used ...
ThinkPad's user avatar
1 vote
1 answer
55 views

In gradient descent for neural networks, we optimize over a loss surface defined by our loss function L(W) where W represents the network weights. However, since there are infinitely many possible ...
semahaissa's user avatar
0 votes
0 answers
45 views

I am trying to train a multi-label video classification model. My dataset consists of just one video, sampled at 1fps. I have a total of 12k frames and 21 classes, and in a single frame multiple ...
Berk Ali Çam's user avatar
1 vote
1 answer
107 views

I am training Deep Learning models to predict the Remaining Useful Life (RUL) of certain devices. The RUL is an estimate of the time remaining until the device is expected to fail. Accurate ...
user386164's user avatar
0 votes
0 answers
101 views

I am now trying to train a GAN called AOT-GAN to perform inpainting on some anodized aluminium surfaces. At the beginning, I used a Canon camera to take the photos for training the AOT-GAN....
galoischan's user avatar
1 vote
1 answer
158 views

I'm doing research on actor-critic methods and I want to make sure that I understand these methods correctly. First of all, I understand that as it's a combination of value-based and policy-based ...
marc_spector's user avatar
1 vote
1 answer
102 views

I have a question regarding how the expected return of a deterministic policy is written. I have seen that in some cases they use the Q-function, as shown in the part Objective function ...
marc_spector's user avatar
0 votes
0 answers
93 views

Typically in supervised learning, a neural network's output is compared to the targets through a loss function, and the gradients are backpropagated. Is it a bad idea to also have a loss function on ...
Liubove's user avatar
  • 11
2 votes
1 answer
65 views

I have a scenario where I am trying to optimize a vector of D dimensions. Every component of the vector is dependent on other components according to a function such as: summation over (i,j): (1-e(x_i)(...
Darkmoon Chief's user avatar
2 votes
1 answer
152 views

I am working with a feedforward neural network to fit the following simple function: N(1) = -1, N(2) = -1, N(3) = 1, N(4) = -1. But I don't want to use the Mean-...
Andrew Baker's user avatar
0 votes
1 answer
145 views

I have a rather large ML framework that takes multiple conditional probability terms that are computed via classifiers/neural networks. This arbitrary loss function is computed via a function: ...
QuantumPanda's user avatar
2 votes
0 answers
94 views

It is widely believed that synaptic plasticity is the way biological brains learn. Artificial implementations of this mechanism are for instance local weight-update rules in Spiking Neural Networks. ...
Alex's user avatar
  • 121
0 votes
1 answer
52 views

I've been attempting to mess around with Sparse Categorical Cross Entropy Loss for the MNIST dataset. I can't seem to figure out what might be wrong with my implementation, the loss seems to ...
tensor's user avatar
  • 125
0 votes
0 answers
64 views

I'm working on a reinforcement learning problem where the environment returns a reward pair $(r_{t+1}^{(a)}, r_{t+1}^{(b)})$. The goal is to maximize the following nonlinear objective function. $$ E[\...
Alex's user avatar
  • 1
1 vote
1 answer
112 views

It is known that multitask objectives in neural networks sometimes have the effect of improving the performance of the neural network for each of the tasks individually (versus training the same ...
Alexander Soare's user avatar
2 votes
2 answers
97 views

In the famous AI book Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig (4th edition), in chapter 3, the action cost function of a problem solver agent denoted as $c(s, a, ...
user153245's user avatar
2 votes
1 answer
110 views

Been reviewing some old foundational material and ran into this comment by Hinton on Rprop in his old Coursera class: Rprop is equivalent to using the gradient, but also dividing by the size of the ...
eof's user avatar
  • 121
0 votes
0 answers
63 views

I'm working on a project that involves a non-differentiable loss, and I'm thinking about how I should deal with it. My model is a very big LSTM model (about 1M parameters), and after 500 steps (not sure ...
TWTom's user avatar
  • 13
1 vote
0 answers
104 views

I'm looking into Learning to Rank models - specifically, the LGBMRanker model - and I want to understand how it's able to train. It takes in features, group sizes and labels, and optimizes for a ...
Shirish's user avatar
  • 413
1 vote
0 answers
77 views

I am studying machine learning and wanted to work on a project of my own so that I have better chances after graduating college. I'm studying the application of ML to improve searches using a toy ...
user9343456's user avatar
1 vote
1 answer
123 views

I'm currently studying reinforcement learning through CS 285 provided by UC Berkeley. At 1:52 of the part 5 of the lecture 11, I got confused on why one would want to learn an observation model $p(o_t ...
platoDev's user avatar
2 votes
2 answers
150 views

In linear regression, I train the model so the graph runs best through the data points, so the geometric distance between f(x) and $y^i$ is minimized. Now, is it correct that in logistic regression I ...
Jacky02's user avatar
  • 21
1 vote
0 answers
75 views

Is a gradient descent step always supposed to decrease loss? I can think of a situation where it would seem that gradient descent would increase loss but maybe I am misunderstanding a part of ...
Mike Levi's user avatar
1 vote
2 answers
91 views

I am trying to do multi-spectral image fusion. I am using the following paper as a reference. https://arxiv.org/pdf/1804.08361.pdf The code available on GitHub works well. But, I am trying to add some ...
programmer_04_03's user avatar
1 vote
0 answers
1k views

Two objective functions are used during the BERT language model pretraining step. The first one is masked language model (MLM) that randomly masks 15% of the input tokens and the objective is to ...
XYZ's user avatar
  • 121
4 votes
1 answer
2k views

I am optimizing a neural network with Adam using 3 different losses. Their scale is very different, and the current method is to either sum the losses and clip the gradient or to manually weight them ...
Simon's user avatar
  • 263
0 votes
1 answer
60 views

I have a neural network that predicts 2 classes of a time series (bottom and top). Currently my Y labels are size 2: [1 0] for bottom and [0 1] for top. The NN has 2 output nodes. Of course not every ...
dorien's user avatar
  • 226
0 votes
1 answer
494 views

Let's say I have a binary classification problem and I want to solve it by means of FC neural net. So which approach will be correct: 1) define the last layer of NN like this ...
dmasny's user avatar
  • 23
0 votes
1 answer
5k views

What's the difference between classification and segmentation in deep learning? In particular, can the classification loss function be used for segmentation problems?
lllittleX's user avatar
2 votes
1 answer
180 views

I have a use case where the model needs to detect fabric defects. There are 15+ different kinds of defects. In one image there can be multiple defects present. The straightforward solution for this ...
Nick De Wispelaere's user avatar
1 vote
1 answer
827 views

Deep learning book chapter 6: In 6.2.1.2 last paragraph: Unfortunately, mean squared error and mean absolute error often lead to poor results when used with gradient-based optimization. Some output ...
vivian.ai's user avatar
0 votes
1 answer
129 views

SigmoidBinaryCrossEntropyLoss implementation in DJL accepts two kinds of outputs from NNs: (1) where sigmoid activation has already been applied; (2) where raw NN output ...
src091's user avatar
  • 1
1 vote
0 answers
69 views

I have a binary classification problem, where there are multiple correct predictions, however, I would consider the prediction to be correct if the highest confidence prediction of a 1 is correct. I ...
John Meighan's user avatar
0 votes
1 answer
90 views

I am training an auto-encoder over $10^4$ epochs. I get a converging learning curve. However, the error at the last stages stays huge, $\sim10^{15}$. What does this mean? Does it mean that my auto-...
devCharaf's user avatar
  • 101
1 vote
0 answers
145 views

This is a follow-up to the already asked question: Is the neural network 100% accurate on training data if epoch loss is minimized to 0? I want to train a neural network that works as an approximator ...
Acad's user avatar
  • 111
1 vote
0 answers
448 views

Xu et al. (2022) distinguishes between popular pre-training methods for language modeling: (see Section 2.1 PRETRAINING METHODS) Left-to-Right: Auto-regressive, Left-to-right models, predict the ...
keyboardAnt's user avatar
1 vote
1 answer
109 views

I will start with an example, in order to get to the general question. I was reading the following paper (https://www.cns.nyu.edu/pub/lcv/wang03-preprint.pdf) about Structural Similarity Index (SSIM), ...
Theo Deep's user avatar
  • 205
1 vote
1 answer
271 views

From what I understand, a Generative Adversarial Network (GAN) is composed of an encoder (generator), some synthetic data (fake data) and a discriminator that will penalize any distinguishable real ...
Rhesus's user avatar
  • 13
3 votes
1 answer
236 views

CrossEntropyLoss optimizes the overall classification accuracy as $$ {n_{\text{correct}} \over N} $$ What loss function should I use if I only care about increasing the true positive rate of one class?...
em1971's user avatar
  • 183
0 votes
2 answers
123 views

I have voice recordings which are labelled by not only a single label but multiple labels. Each voice recording corresponds to one of class labels within a set. In other words, the training instance ...
MilTom's user avatar
  • 113
10 votes
1 answer
10k views

What is the difference between the triplet loss and the contrastive loss? They look the same to me. I don't understand the nuances between the two. I have the following queries: When to use what? What ...
Exploring's user avatar
  • 381
1 vote
2 answers
874 views

I'm trying to get my toy network to learn a sine wave. I output (via tanh) a number between -1 and 1, and I want the network to minimise the following loss, where ...
cjm2671's user avatar
  • 113
1 vote
2 answers
583 views

I've read that the discriminator $D$ validates an image $D(x)$, where $x$ is either a real image or a fake one created by the generator, i.e. $ D(G(x))$. What does the function of the discriminator ...
Lukas Pezzei's user avatar
2 votes
0 answers
51 views

We're working on a sequence-to-sequence problem using pytorch, and are using cross-entropy to calculate the loss when comparing the output sequence to the target sequence. This works fine and ...
vgoklani's user avatar
  • 121
3 votes
1 answer
334 views

In the original DQN paper, the $\ell_2$ loss is taken over the distance between our network output, $\hat{q}(s_j,a_j,w)$ and the labels $y_j=r_j+\gamma \cdot \max\limits_{a'} \hat{q}(s_{j+1},a',w^-)$, ...
Hadar Sharvit's user avatar
1 vote
0 answers
66 views

I'm fairly new to deep learning and looking for some reference literature... Specifically, I want to train a neural network to predict vectors $v \in \mathbb{R}^3$ under the constraint $||v||\leq 1$. ...
Lilla's user avatar
  • 111
0 votes
1 answer
137 views

It just occurred to me that this seems like it should be a very common problem that must have some kind of solution... Yet I'm not sure what it is... If there is no solution, does this mean once a ...
profPlum's user avatar
  • 566
