Questions tagged [linear-regression]
For questions related to the theory or application of linear regression.
73 questions
1
vote
1
answer
44
views
Loss keeps increasing when using full-batch gradient descent
I am learning the linear regression model based on this tutorial. Following the example provided in the tutorial, it works fine with mini-batch stochastic gradient descent.
...
1
vote
1
answer
83
views
How does linear regression help PLA?
So I got this question for my lab:
Q: Show how linear regression for classification can improve the pocket algorithm with PLA.
I thought linear regression was bad for classification? So how can linear ...
2
votes
1
answer
61
views
LSTM outperforms linear regression in shorter time series
We know that it's quite common for simple methods to perform well, especially out-of-sample (which is where it matters). This effect becomes stronger on short series.
I have time-series TEC data. ...
0
votes
1
answer
63
views
Understanding Generalization Error in Empirical vs True Risk Illustration
I am trying to understand the concept of generalization error based on the attached illustration that contrasts empirical risk ($\hat{R}$) with true risk ($R$).
Two regions are marked in the diagram:
Red-...
1
vote
3
answers
117
views
Why use a neural network approach for linear regression if there is already an explicit solution
I have noticed that many introductory materials on neural networks use linear regression to predict house prices as the standard first example on neural networks. This seems to be a common practice.
...
1
vote
1
answer
101
views
Regression model is doing exceptionally well on time series
I have the following task to do: I have time series data. I train on 3 consecutive days to predict each 4th day. Each day's data is one CSV file with dimension 24x25. Every ...
0
votes
1
answer
74
views
Is a linear regression model able to figure out the relation of division among two features?
I have a dataset that consists of data about students. The features are things such as passed credits, failed credits, ...
0
votes
1
answer
67
views
Collaborative filtering using linear regression
While doing Andrew Ng's unsupervised learning specialization, I came across this algorithm for collaborative filtering:
Here, Xi refers to the feature vector of objects (e.g., action in movies, ...
0
votes
0
answers
39
views
How is the system of equations that generates β0 and β1 in simple linear regression solved, resulting in the formulas?
Good evening, I wanted to know how the system of equations is solved step by step to arrive at the formulas for β0 and β1 in simple linear regression. In the following picture, there is the system of ...
1
vote
1
answer
79
views
Why is the same learning rate for slope and intercept not working in linear regression?
I'm a new student in AI, currently learning linear regression. I used the California housing dataset for my experiments. My goal is to predict the 'population' column based on the 'total_rooms' ...
2
votes
1
answer
640
views
How to make a model forget specific training it has received?
Does L1/L2 (NAdam weight decay) really make the model "unlearn"?
OK, so my question might be dumb, but is there any way to "unlearn" a model? And yeah, I know there is weight_decay ...
1
vote
1
answer
107
views
Unclear steps in derivation of normal equations in linear regression using linear algebra approach
How are eqs.(3.55) and (3.56) obtained? Especially, it is unclear how triangle inequality implies eq.(3.56) because we have squared norms.
0
votes
1
answer
207
views
Can multiple linear regression using the least squares (OLS) method also be used to solve simple linear regression problems? Would both be equivalent?
Simple Linear Regression reference:
https://online.stat.psu.edu/stat462/node/93/
Multiple Linear Regression reference:
https://online.stat.psu.edu/stat462/node/131/
I see that the way to calculate the ...
0
votes
1
answer
126
views
Can Adaline perform multiple linear regression equivalent to the least squares method?
https://en.wikipedia.org/wiki/ADALINE
Can Adaline (Adaptive Linear Neuron) be used to perform multiple linear regression that is equivalent to the least squares method?
1
vote
1
answer
129
views
How to compute an estimate of the expected value of a stochastic random variable in Reinforcement Learning?
In the section on LSTD in Sutton and Barto's book on RL, there is a proof of convergence of semi-gradient TD(0) using a linear function approximator.
Later on they estimated A and b as
I was under the ...
1
vote
2
answers
343
views
Why is my best fit line not a single straight line? | Multiple Linear Regression
I am working on Multiple Linear Regression (multiple variables). I have been able to predict and get a good r2 score. But I am not sure that I understood the part about plotting the best fit line. I can't ...
2
votes
2
answers
150
views
In logistic regression, do I try to fit the graph perfectly or minimize the error in the predicted probabilities?
In linear regression, I train the model so the graph runs best through the data points, so the geometric distance between f(x) and $y^i$ is minimized.
Now, is it correct that in logistic regression I ...
0
votes
2
answers
291
views
cross_val_score of sklearn and LinearRegression scoring method
The function cross_val_score uses the estimator’s default scorer (if available) and LinearRegression (the estimator I use) uses the coefficient of determination (which is defined as $R^2 = 1 - \frac{u}...
-1
votes
1
answer
74
views
SQL Machine Learning using matrix multiplication
What is the easiest classification algorithm in SQL when my data looks like this?
...
0
votes
1
answer
77
views
Multi-layer network only predicts linear trends
I have made a neural network from scratch (in Java), which is refusing to switch out of linear regression. I have pushed up the layer sizes (it now has 2 hidden layers, both with 5 neurons), and yet ...
1
vote
0
answers
86
views
Linear Actor-Critic for a continuing task with one continuous action: any comments?
I wish to implement an Actor-Critic agent using linear functions for a continuing task with one continuous action. Below is the resulting pseudo-code I arrived at on my own (the initialization part is ...
0
votes
2
answers
2k
views
Training a regression model on a set of values in the 0-1 range to output continuous 0-1 values
I have a textual dataset that has a set of real numbers as labels: L={0.0, 0.33, 0.5, 0.75, 1.0}, and I have a model that takes the text as input and has a Sigmoid output.
If I train the model on this ...
0
votes
1
answer
81
views
Simple Polynomial Gradient Descent algorithm not working
I am trying to implement a simple 2nd order polynomial gradient descent algorithm in Java. It is not converging and becomes unstable. How do I fix it?
...
-1
votes
1
answer
279
views
Why is the cross-entropy a cost function?
The question looks foolish, but I think cross-entropy is somewhat weird as a cost function.
As a cost function for linear regression, the mean square error $\sum_{i=1}^{n} (y_i - (ax_i+b))^2$ seems ...
1
vote
1
answer
921
views
Not able to understand PyTorch tensor (weights & biases) size for linear regression
Below are the two tensors
...
0
votes
1
answer
413
views
Which of the following two implementations of a Least Squares classifier in Python is correct?
I am trying to solve a classification problem by implementing the Least Squares algorithm in Python. To solve this problem, I am implementing the linear algebra formula to train the classifier, which ...
2
votes
3
answers
337
views
Is there any domain in machine learning that solves a problem by using only analytical algorithms?
Most of the algorithms in machine learning I am aware of use datasets and learning happens in an iterative manner given some examples. The examples can also be understood as experience in the case of ...
2
votes
1
answer
934
views
Would either $L_1$ or $L_2$ regularisation lower the MSE on the training and test data?
Consider linear regression. The mean squared error (MSE) is 120.5 for the training dataset. We've reached the minimum for the training data.
Is it possible that by applying Lasso (L1 regularization) ...
0
votes
1
answer
79
views
Which machine learning technique can I use to match one set of data points to another?
I have two measuring devices. Both measure the same thing. One is accurate, the other is not, but does correlate with a non-fixed offset, some outliers, and some noise.
I won't always be using the ...
1
vote
0
answers
164
views
Linear output layer back propagation
So I'm stuck on something that is probably very easy, but I can't get my head around it. I'm building a neural network that will consist of many layers with non-linear activation functions (probably ...
8
votes
1
answer
4k
views
Is there a connection between the bias term in a linear regression model and the bias that can lead to under-fitting?
Here is a linear regression model
$$y = mx + b,$$
where $b$ is known as the $y$-intercept, but also known as the bias [1], $m$ is the slope, and $x$ is the feature vector.
As I understood, in machine ...
1
vote
1
answer
287
views
How does parameter adjustment work in gradient descent?
I am trying to comprehend how gradient descent works.
I understand we have a cost function which is defined in terms of the following parameters,
$J(w_{1}, w_{2}, \ldots, w_{n}, b)$
the derivative ...
2
votes
0
answers
120
views
Is there a UCB type algorithm for linear stochastic bandit with lasso regression?
Why is there no upper confidence bound algorithm for linear stochastic bandits that uses lasso regression in the case that the regression parameters are sparse in the features?
In particular, I don't ...
0
votes
1
answer
357
views
Hyper-plane in logistic regression vs linear regression for same number of features
Geometric interpretation of Logistic Regression and Linear regression is considered here.
I was going through Logistic regression and Linear regression. In the optimization equation of both following ...
0
votes
1
answer
77
views
Effect of adding an Independent Variable in Multiple Linear Regression
I am new to machine learning and am learning the concept of linear regression. Please help with answers to the queries below.
I want to understand the effect on an existing independent variable (X1) if I add a new ...
0
votes
3
answers
204
views
What ML algorithm should I use that suits this data?
What if I have some data, let's say I'm trying to answer if education level and IQ affect earnings, and I want to analyze this data and put in a regression model to predict earnings based on the IQ ...
2
votes
1
answer
1k
views
Do correlations matter when building neural networks?
I am new to working with neural networks. However, I have built some linear regression models in the past. My question is, is it worth looking for features with a correlation to my target variable as ...
2
votes
1
answer
303
views
Why is the hypothesis function $h_{\theta}(x)$ equivalent to $E[y | x; \theta]$ in generalised linear models?
Reading through the CS229 lecture notes on generalised linear models, I came across the idea that a linear regression problem can be modelled as a Gaussian distribution, which is a form of the ...
1
vote
1
answer
144
views
If features are always positive, why do we use ReLU activation functions?
When does it happen that a layer (either first or hidden) outputs negative values in order to justify the use of ReLU?
As far as I know, features are never negative or converted to negative in any ...
1
vote
0
answers
61
views
3D representation of a regression with two independent variables, one categorical and one continuous
I have a (hopefully) fundamental question about whether I understand things correctly.
(Thank you in advance, and sorry for my English, which might not be so good.)
1-Preamble 1:
I know that if we have 2 independent ...
1
vote
1
answer
93
views
Is there any way to apply linear transformations on a vector other than matrix multiplication?
I am trying to optimize the cost function calculation in regression analysis using a non-matrix multiplication based approach.
More specifically, I have a point $x = (1, 1, 2, 3)$, to which I want to ...
2
votes
1
answer
293
views
Do I need to denormalise results in linear regression?
I have learned so far how to do linear regression with one or multiple features. So far, so good, everything seems to work fine, at least for my first simple examples.
However, I now need to normalise my ...
2
votes
0
answers
82
views
What is the difference between a generalised estimating equation and a recurrent neural network?
What is the difference between a generalised estimating equation (GEE) model and a recurrent neural network (RNN) model, in terms of what these two models are doing? Apart from the differences in the ...
2
votes
1
answer
808
views
What is the difference between linear and non-linear regression?
In machine learning, I understand that linear regression assumes that the parameters or weights in the equation should be linear. For example:
$$y = w_1x_1 + w_2x_2$$
is a linear equation where $x_1$ and $...
1
vote
2
answers
441
views
How do we choose the activation function for each hidden node? [duplicate]
I am new to neural networks. I would like to use them as a fitting or forecasting method.
A simple NN model that does not contain hidden layers, that is, the input nodes are directly connected to the ...
0
votes
2
answers
173
views
Is it still called linear separation with a layer of more than 1 neuron?
A single neuron will be able to do linear separation. For example, XOR simulator network:
...
0
votes
1
answer
90
views
Solution to classify product names
I have a bunch of training data for classifying product names, around 30,000 samples. The task is to classify these product names into types of product, around 100 classes (single words).
For example:...
1
vote
1
answer
139
views
TensorFlow estimator DNNClassifier fails to fit simple data
The ready-to-use DNNClassifier in tf.estimator seems unable to fit these data:
...
2
votes
4
answers
246
views
How is regression machine learning?
In regression, in order to minimize an error function, a functional form of hypothesis $h$ must be decided upon, and it must be assumed (as far as I'm concerned) that $f$, the true mapping of instance ...
2
votes
1
answer
495
views
Calculating parameter values using gradient descent for a linear regression model
Consider the following data with one input (x) and one output (y):
(x=1, y=2)
(x=2, y=1)
(x=3, y=2)
Apply linear regression on this data, using the hypothesis $h_Θ(x) = Θ_0 + Θ_1 x$, where $...