Newest 'objective-functions' Questions

0 votes

2 answers

79 views

What loss function to choose that will assign a higher penalty to false negatives than to false positives for a regression task?

I am using a machine learning model to remove interference from range-doppler maps to detect targets. I am using a supervised approach, in which I give as input the range-doppler map of target+...

ThinkPad

43

asked Aug 12 at 13:13

3 votes

1 answer

107 views

Universally better activation/loss function or specific-case dependency?

With the popularity of AIs from every media source this year, I'm interested in learning more about them and maybe one day building a good one. I have this code in Python: ...

Root Groves

159

asked Jul 27 at 23:28

2 votes

1 answer

95 views

Are human content moderators needed anymore with AI?

Are human content moderators for social media websites needed anymore with AI? In other words, is AI so good now that it can detect if an image is pornographic, obscene, or in any violating social ...

Geremia

577

asked Jun 21 at 21:50

0 votes

0 answers

85 views

How to use a contrastive loss function for multi-label classification?

I have a multi-label classification problem, where I was initially using a binary cross-entropy loss and my labels are one-hot encoded. I found a paper similar to my application and have used ...

ThinkPad

43

asked Feb 15 at 8:29

1 vote

1 answer

55 views

How can gradient descent optimize a loss surface that's never fully computed?

In gradient descent for neural networks, we optimize over a loss surface defined by our loss function L(W) where W represents the network weights. However, since there are infinitely many possible ...

semahaissa

11

asked Feb 15 at 8:11

0 votes

0 answers

45 views

How to write a custom loss for multi-label video classification?

I am trying to train a multi-label video classification model. My dataset consists of just one video, sampled at 1fps. I have a total of 12k frames and 21 classes, and in a single frame multiple ...

Berk Ali Çam

1

asked Jan 9 at 11:43

1 vote

1 answer

107 views

Loss function that penalizes errors more at low values

I am training Deep Learning models to predict the Remaining Useful Life (RUL) of certain devices. The RUL is an estimate of the time remaining until the device is expected to fail. Accurate ...

user386164

121

asked Jan 7 at 7:18

0 votes

0 answers

101 views

Sudden NaN in the loss when training a GAN for inpainting (AOT-GAN); I am sure there is no NaN in the input

I am now trying to train a GAN called AOT-GAN to do inpainting on some anodized aluminium surfaces. At the beginning, I used a Canon camera to take the photos for training the AOT-GAN. ...

galoischan

1

asked Dec 21, 2024 at 14:25

1 vote

1 answer

158 views

Are these objective and loss functions from Actor-Critic Methods correct?

I'm doing research on actor-critic methods and I want to make sure that I understand these methods correctly. First of all, I understand that as it's a combination of value-based and policy-based ...

marc_spector

57

asked Dec 7, 2024 at 11:42

1 vote

1 answer

102 views

Expected return formula for deterministic policy

I have a question regarding how the expected return of a deterministic policy is written. I have seen that in some cases they use the Q-function, as shown in the part Objective function ...

marc_spector

57

asked Nov 30, 2024 at 10:09

0 votes

0 answers

93 views

Loss function on intermediate layers of the networks

Typically in supervised learning, a neural network's output is compared to the targets through a loss function, and the gradients are backpropagated. Is it a bad idea to also have a loss function on ...

Liubove

11

asked Nov 20, 2024 at 21:02

2 votes

1 answer

65 views

Do we plug in the old values or the new values during the gradient descent update?

I have a scenario where I am trying to optimize a vector of D dimensions. Every component of the vector is dependent on other components according to a function such as: summation over (i,j): (1-e(x_i)(...

Darkmoon Chief

31

asked Nov 5, 2024 at 10:07

2 votes

1 answer

152 views

Custom Loss Function Traps Network in Local Optima

I am working with a feedforward neural network to fit the following simple function: N(1) = -1, N(2) = -1, N(3) = 1, N(4) = -1. But I don't want to use the Mean-...

Andrew Baker

21

asked Aug 15, 2024 at 17:20

0 votes

1 answer

145 views

Using conditional probability as an estimate in a loss function

I have a rather large ML framework that takes multiple conditional probability terms that are computed via classifiers/neural networks. This arbitrary loss function is computed via a function: ...

QuantumPanda

101

asked Jul 8, 2024 at 22:41

2 votes

0 answers

94 views

Can local learning rules minimize a global loss?

It is widely believed that synaptic plasticity is the way biological brains learn. Artificial implementations of this mechanism are for instance local weight-update rules in Spiking Neural Networks. ...

Alex

121

asked Jun 14, 2024 at 7:40

0 votes

1 answer

52 views

Sparse Cross Entropy

I've been attempting to mess around with Sparse Categorical Cross Entropy Loss for the MNIST dataset. I can't seem to figure out what might be wrong with my implementation, the loss seems to ...

tensor

125

asked May 16, 2024 at 1:31

0 votes

0 answers

64 views

Optimizing a nonlinear objective function in Deep Reinforcement Learning

I'm working on a reinforcement learning problem where the environment returns a reward pair $(r_{t+1}^{(a)}, r_{t+1}^{(b)})$. The goal is to maximize the following nonlinear objective function. $$ E[\...

Alex

1

asked May 13, 2024 at 13:34

1 vote

1 answer

112 views

Multi-task objectives sometimes improve single-task performance, but is this true when fine-tuning?

It is known that multitask objectives in neural networks sometimes have the effect of improving the performance of the neural network for each of the tasks individually (versus training the same ...

Alexander Soare

1,389

asked Mar 15, 2024 at 10:27

2 votes

2 answers

97 views

Why does the action cost function depend on the result state in search problems?

In the famous AI book Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig (4th edition), in chapter 3, the action cost function of a problem solver agent denoted as $c(s, a, ...

user153245

195

asked Feb 24, 2024 at 11:28

2 votes

1 answer

110 views

Can you explain Hinton's comment "Rprop is equivalent to using the gradient, but also dividing by the size of the gradient"?

Been reviewing some old foundational material and ran into this comment by Hinton on Rprop in his old Coursera class: Rprop is equivalent to using the gradient, but also dividing by the size of the ...

eof

121

asked Feb 4, 2024 at 11:32

0 votes

0 answers

63 views

Training with a non-differentiable loss function in actor-critic style

I'm working on a project that involves a non-differentiable loss, and I'm thinking about how I should deal with it. My model is a very big LSTM model (about 1M parameters), and after 500 steps (not sure ...

TWTom

13

asked Jan 24, 2024 at 0:04

1 vote

0 answers

104 views

How do LGBM rankers train?

I'm looking into Learning to Rank models - specifically, the LGBMRanker model - and I want to understand how it's able to train. It takes in features, group sizes and labels, and optimizes for a ...

Shirish

413

asked Oct 10, 2023 at 5:52

1 vote

0 answers

77 views

Search recall optimization - what appropriate loss function to use?

I am studying machine learning and wanted to work on a project of my own so that I have better chances after graduating college. I'm studying the application of ML to improve searches using a toy ...

user9343456

181

asked Sep 27, 2023 at 17:16

1 vote

1 answer

123 views

Why learn an observation model when training a latent space model in model-based RL?

I'm currently studying reinforcement learning through CS 285 provided by UC Berkeley. At 1:52 of part 5 of lecture 11, I got confused about why one would want to learn an observation model $p(o_t ...

platoDev

13

asked Sep 14, 2023 at 2:29

2 votes

2 answers

150 views

In logistic regression, do I try to fit the graph perfectly or minimize the error in the predicted probabilities?

In linear regression, I train the model so the graph runs best through the data points, so the geometric distance between f(x) and $y^i$ is minimized. Now, is it correct that in logistic regression I ...

Jacky02

21

asked Jun 30, 2023 at 12:30

1 vote

0 answers

75 views

Can gradient descent cause loss to increase in some situations?

Is a gradient descent step always supposed to decrease loss? I can think of a situation where it would seem that gradient descent would increase loss but maybe I am misunderstanding a part of ...

Mike Levi

11

asked Jun 18, 2023 at 21:02

1 vote

2 answers

91 views

How do I assign a weight to an additional loss?

I am trying to do multi-spectral image fusion. I am using the following paper as a reference. https://arxiv.org/pdf/1804.08361.pdf The code available on GitHub works well. But, I am trying to add some ...

programmer_04_03

73

asked Jun 10, 2023 at 22:13

1 vote

0 answers

1k views

What are the MLM and NSP loss functions?

Two objective functions are used during the BERT language model pretraining step. The first one is the masked language model (MLM), which randomly masks 15% of the input tokens and the objective is to ...

XYZ

121

asked May 26, 2023 at 5:01
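
For readers skimming the entry above, the masking step it mentions can be illustrated with a minimal sketch. This is not BERT's actual preprocessing: the names `mask_tokens`, `MASK_TOKEN`, `mask_prob`, and `seed` are hypothetical, and the sketch always substitutes the mask token, whereas BERT replaces only most of the selected tokens with it and leaves or randomizes the rest.

```python
import random

# Hypothetical placeholder string; real tokenizers define their own mask token.
MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Independently mask each token with probability `mask_prob`.

    Returns (masked, labels). labels[i] holds the original token at masked
    positions and None elsewhere, so a loss would be computed only on the
    positions the model has to reconstruct.
    """
    rng = random.Random(seed)  # local generator, so a seed gives repeatable masks
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok          # remember what the model should predict
            masked[i] = MASK_TOKEN   # hide it from the model's input
    return masked, labels
```

A call such as `mask_tokens("the cat sat on the mat".split(), seed=0)` returns the partially masked sequence together with the per-position prediction targets.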

4 votes

1 answer

2k views

What is the best way to combine or weight multiple losses with gradient descent?

I am optimizing a neural network with Adam using 3 different losses. Their scale is very different, and the current method is to either sum the losses and clip the gradient or to manually weight them ...

Simon

263

asked May 24, 2023 at 17:29

0 votes

1 answer

60 views

Which loss / activation function with 2 classes that do not occur often and do not sum to one?

I have a neural network that predicts 2 classes of a time series (bottom and top). Currently my Y labels are size 2: [1 0] for bottom and [0 1] for top. The NN has 2 output nodes. Of course not every ...

dorien

226

asked Mar 18, 2023 at 7:16

0 votes

1 answer

494 views

What is the correct loss function for binary classification: Cross entropy or Binary cross entropy?

Let's say I have a binary classification problem and I want to solve it by means of an FC neural net. So which approach will be correct: 1) define the last layer of the NN like this ...

dmasny

23

asked Jan 25, 2023 at 5:53

0 votes

1 answer

5k views

What's the difference between classification and segmentation in deep learning?

What's the difference between classification and segmentation in deep learning? In particular, can the classification loss function be used for segmentation problems?

lllittleX

1

asked Jan 13, 2023 at 2:37

2 votes

1 answer

180 views

Image classification problem with multiple right classes

I have a use case where the model needs to detect fabric defects. There are 15+ different kinds of defects. In one image there can be multiple defects present. The straightforward solution for this ...

Nick De Wispelaere

21

asked Dec 1, 2022 at 11:09

1 vote

1 answer

827 views

Why do MSE and MAE yield poor results when used with gradient-based optimization for classification?

Deep learning book chapter 6: In 6.2.1.2 last paragraph: Unfortunately, mean squared error and mean absolute error often lead to poor results when used with gradient-based optimization. Some output ...

vivian.ai

27

asked Nov 4, 2022 at 14:01

0 votes

1 answer

129 views

Why is `SigmoidBinaryCrossEntropyLoss` in `DJL` implemented this way?

The SigmoidBinaryCrossEntropyLoss implementation in DJL accepts two kinds of outputs from NNs: where sigmoid activation has already been applied; where raw NN output ...

src091

1

asked Oct 19, 2022 at 5:42

1 vote

0 answers

69 views

Loss Function for Binary Classification with Multiple Correct Choices

I have a binary classification problem, where there are multiple correct predictions, however, I would consider the prediction to be correct if the highest confidence prediction of a 1 is correct. I ...

John Meighan

11

asked Oct 16, 2022 at 18:49

0 votes

1 answer

90 views

Learning curve converges with huge errors

I am training an auto-encoder over $10^4$ epochs. I get a converging learning curve. However, the error at the last stages stays huge, $\sim10^{15}$. What does this mean? Does it mean that my auto-...

devCharaf

101

asked Oct 12, 2022 at 15:37

1 vote

0 answers

145 views

Training a neural network simultaneously with two different loss functions rather than considering the weighted sum

This is a follow-up on the already asked question: Is the neural network 100% accurate on training data if epoch loss is minimized to 0? I want to train a neural network that works as an approximator ...

Acad

111

asked Sep 28, 2022 at 16:32

1 vote

0 answers

448 views

Left-to-Right vs Encoder-decoder Models

Xu et al. (2022) distinguish between popular pre-training methods for language modeling (see Section 2.1 PRETRAINING METHODS): Left-to-Right: Auto-regressive, left-to-right models predict the ...

keyboardAnt

39

asked Sep 20, 2022 at 22:28

1 vote

1 answer

109 views

Do we need to know or verify properties of loss functions / metrics' implementations?

I will start with an example, in order to get to the general question. I was reading the following paper (https://www.cns.nyu.edu/pub/lcv/wang03-preprint.pdf) about Structural Similarity Index (SSIM), ...

Theo Deep

205

asked Sep 19, 2022 at 16:16

1 vote

1 answer

271 views

Is the discriminator of a GAN network embedded in VAE?

From what I understand, a Generative Adversarial Network (GAN) is composed of an encoder (generator), some synthetic data (fake data) and a discriminator that will penalize any distinguishable real ...

Rhesus

13

asked Aug 2, 2022 at 21:55

3 votes

1 answer

236 views

What loss function should I use if I only care about the accuracy of one class?

CrossEntropyLoss optimizes the overall classification accuracy as $$ {n_{\text{correct}} \over N} $$ What loss function should I use if I only care about increasing the true positive rate of one class?...

em1971

183

asked Jul 19, 2022 at 8:13

0 votes

2 answers

123 views

How to define a loss function for multi-label problem?

I have voice recordings which are labelled by not only a single label but multiple labels. Each voice recording corresponds to one of class labels within a set. In other words, the training instance ...

MilTom

113

asked Jul 7, 2022 at 10:50

10 votes

1 answer

10k views

What is the difference between the triplet loss and the contrastive loss?

What is the difference between the triplet loss and the contrastive loss? They look the same to me. I don't understand the nuances between the two. I have the following queries: When to use what? What ...

Exploring

381

asked Jun 18, 2022 at 19:00

1 vote

2 answers

874 views

What should I think about when designing a custom loss function?

I'm trying to get my toy network to learn a sine wave. I output (via tanh) a number between -1 and 1, and I want the network to minimise the following loss, where ...

cjm2671

113

asked Jun 9, 2022 at 15:14

1 vote

2 answers

583 views

What is the domain of the discriminator of a GAN?

I've read that the discriminator $D$ validates an image $D(x)$, where $x$ is either a real image or a fake one created by the generator, i.e. $ D(G(x))$. What does the function of the discriminator ...

Lukas Pezzei

19

asked Jun 6, 2022 at 20:08

2 votes

0 answers

51 views

How to create a loss function that penalizes duplicate indices in the output tensor?

We're working on a sequence-to-sequence problem using PyTorch, and are using cross-entropy to calculate the loss when comparing the output sequence to the target sequence. This works fine and ...

vgoklani

121

asked Jun 1, 2022 at 16:42

3 votes

1 answer

334 views

Why do we use "true labels" that are based on the output of our network in Deep Q-Learning?

In the original DQN paper, the $\ell_2$ loss is taken over the distance between our network output, $\hat{q}(s_j,a_j,w)$ and the labels $y_j=r_j+\gamma \cdot \max\limits_{a'} \hat{q}(s_{j+1},a',w^-)$, ...

Hadar Sharvit

381

asked May 30, 2022 at 7:40
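
The label in the entry above, $y_j=r_j+\gamma \cdot \max\limits_{a'} \hat{q}(s_{j+1},a',w^-)$, can be made concrete with a small plain-Python sketch. The function names, the `dones` flag, and the list-of-lists layout for the target-network outputs are assumptions made for illustration; the excerpt itself only gives the formula. Dropping the bootstrap term on terminal transitions follows the usual DQN convention.

```python
def dqn_targets(rewards, next_q_target, dones, gamma=0.99):
    """Compute y_j = r_j + gamma * max_a' q(s_{j+1}, a', w^-) per transition.

    rewards:        list of r_j
    next_q_target:  list of per-action value lists from the frozen target
                    network (weights w^-), evaluated at s_{j+1}
    dones:          list of bools; True marks a terminal transition, where
                    the bootstrap term is dropped (assumed convention)
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_target, dones):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


def l2_loss(predictions, targets):
    """Mean squared distance between q(s_j, a_j, w) and the fixed labels y_j."""
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)
```

The labels are treated as constants during the update: they come from the target weights $w^-$, so only the prediction side of `l2_loss` would carry gradients with respect to $w$ in a real training loop.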

1 vote

0 answers

66 views

Learning values in open ball: which final layers to employ?

I'm fairly new to deep learning and looking for some reference literature... Specifically, I want to train a neural network to predict vectors $v \in \mathbb{R}^3$ under the constraint $||v||\leq 1$. ...

Lilla

111

asked Apr 28, 2022 at 20:38

0 votes

1 answer

137 views

How is catastrophic cancellation dealt with in loss functions?

It just occurred to me that this seems like it should be a very common problem that must have some kind of solution... Yet I'm not sure what it is... If there is no solution, does this mean once a ...

profPlum

566

asked Apr 25, 2022 at 16:49
