Newest Questions
12,751 questions
0
votes
0
answers
7
views
Perplexity AI File Uploads Blocked on Restricted Network — What Domains/IPs Are Required? [closed]
I’m trying to enable file uploads in Perplexity AI on a desktop browser, but uploads consistently fail when using a network with iptables-based outbound restrictions.
Context
Perplexity web UI loads ...
0
votes
0
answers
5
views
What is the Gold Standard for Scaling Adam Training?
I just read this paper which blew my mind. They said that Adam primarily relies on "the sign of the gradient" for updates (due to normalization). Also it would explain why this scaling paper ...
2
votes
0
answers
18
views
Identifying and Reducing Persistent Artifacts in Neural Vocoders
I’m evaluating a few neural vocoders (HiFi-GAN, Vocos, etc.) and I’m seeing the same type of artifact across all of them. I’m not sure exactly what it is, and I’m looking for help identifying it and ...
0
votes
0
answers
24
views
Rethinking My Deep-Research Agent Workflow — Should We Move Beyond Static Trees? [closed]
I’m reevaluating a deep-research workflow I built earlier and would love some advice.
My previous design used a static tree workflow (fixed width/depth, node = search → extract → summarize → generate ...
0
votes
0
answers
11
views
How to improve mAP with a custom YAML file in YOLO, step by step?
I am trying to improve mAP on my RTTS dataset by adding layers or making similar changes in the YOLO YAML files.
First, I ran normal training with yolov8n.pt on the RTTS dataset, and the resulting mAP was 0.74.
Second, I added ...
-2
votes
0
answers
9
views
How to reward AI Direction Shifters sustainably via Work Done? [closed]
I want to explore how Movement DAO can effectively connect my cognitive vision with the DAO’s economic architecture. Specifically:
1. AI Direction Shifters & Work Done Metrics
• Visionaries (“...
0
votes
1
answer
27
views
Recasting optimization as an RL problem
Recently, I read a paper about optimizing airfoil geometry using reinforcement learning. For simplicity, let's say that we want the airfoil to have a high coefficient of lift. What the paper does is ...
0
votes
0
answers
29
views
How to properly change the YOLOv8 YAML file and improve accuracy compared with normal training?
I'm trying to increase the mAP value on the RTTS dataset, and for this, I've made changes to YOLOv8's architecture and added new layers.
With the standard yolov8n.pt ...
-1
votes
0
answers
34
views
Why does Perplexity lose context as the thread gets longer?
I’ve noticed a recurring issue with Perplexity AI when working on longer conversations or complex tasks. As the thread grows, Perplexity starts losing previously provided context—instructions, details,...
-1
votes
0
answers
28
views
Perplexity web app logs out when multiple tabs are open (Firefox 115.12.0esr) [closed]
I am seeing a repeatable logout issue in the Perplexity AI web app and would like to understand whether this is expected session behavior or a browser‑side problem.
Environment
Product: Perplexity AI ...
-1
votes
1
answer
44
views
Missing tokenizer.json file - GGUF Conversion [closed]
When converting mistralai/Mistral-Small-3.2-24B-Instruct-2506 to GGUF (via llama_cpp), I get an error saying the tokenizer.json ...
0
votes
0
answers
4
views
When Using Bidirectional Encoders, Should the Unidirectional Decoder Have Twice the Dimension (Autoencoders)?
Just to avoid misunderstandings from the start: in general, an autoencoder tries to capture the complexity of an input in a latent space that is unable to hold all of that input. It then tries to ...
3
votes
1
answer
238
views
Is Clipping Necessary for PPO?
I believe I have a decent understanding of PPO, but I also feel that it could be stated in a simpler, more intuitive way that does not involve the clipping function. That makes me wonder if there is ...
0
votes
1
answer
43
views
What are the best Python libraries for implementing neural network modification algorithms? [closed]
I want to implement in Python some algorithms from a paper that allow a pre-trained neural network to be modified (adding or removing neurons or layers) while (theoretically) preserving the outputs of ...
1
vote
1
answer
23
views
Stable Baselines vectorized envs: wall-clock performance not improving with the number of processes
I am following the Colab in this link to learn about vectorized environments in SB3. The Colab guides you through 3 experiments, where it explains that simply adding more processes without changing ...
1
vote
1
answer
28
views
How does the critic influence the actor in "Encoder-Core-Decoder" (in shared and separate networks)?
I'm learning RL and understand the basic actor-critic concept, but I'm confused about the technical details of how the critic actually influences the actor during training. Here's my current ...
1
vote
0
answers
34
views
ComfyUI + Flux.1-dev + 30 GiB RAM: Can you speed up the same workflow with 2 GPUs?
I am using the Flux.1-dev text-to-image model for inference through ComfyUI on Kaggle. Everything works, but I noticed that Kaggle offers a second GPU inside the notebook. If I try to run two instances of ...
-1
votes
0
answers
41
views
How to choose the best latent dimension for the VAE?
How to choose the best latent dimension for the VAE?
Which loss should we look at: the total training loss at the end of an epoch, or the average training loss?
I am working on brain connectivity matrices. I ...
1
vote
2
answers
69
views
What is the exact difference between a fully recurrent network and an Elman network?
What is the exact difference between a fully recurrent network and an Elman network? My lecture notes define the Elman network as
\begin{align}
\textbf{s}(t) &= \textbf{W} \textbf{x}(t) + \textbf{a}(t-...
0
votes
2
answers
103
views
If LLMs like OpenAI / DeepSeek / Gemini exist, why do we still need ML or NLP libraries, now and in the future?
I’m new to AI and NLP, and I’m trying to understand how different tools fit together.
Large Language Models (LLMs) like OpenAI, DeepSeek, or Gemini can already handle many NLP tasks text ...
0
votes
1
answer
33
views
Most efficient way to combine different inputs to nodes for better results
Suppose you have an input layer consisting of the set of input nodes: $\{i_{0}, i_{1}, \ldots, i_{n}\}$.
Is there a proof that a combination or a permutation of some nodes with each other would lead to more ...
3
votes
1
answer
113
views
Why do AI language models overuse em dashes compared to human writers?
I've noticed a consistent pattern in AI-generated text: frequent overuse of em dashes (—), sometimes multiple times in a single paragraph. In contrast, in common human writing—even in the sources AI ...
0
votes
1
answer
68
views
OpenAI Electricity Requirement: Matrix Multiplication
The article states OpenAI would require 30 million GPUs for a data center consuming 250 GW.
What portion of this power requirement is due to matrix multiplication?
Edit:
I am looking for percentages of ...
0
votes
0
answers
31
views
Could you suggest literature or conceptual approaches for fusing any two arbitrary input representations in DRL?
I am working on a university project exploring Contextual Reinforcement Learning (CRL) using Actor-Critic algorithms (like PPO and SAC). My focus is on how to effectively integrate the state ...
-1
votes
1
answer
30
views
How to create a free RAG with output the same as Claude's, or close to it
I want to create a RAG system for personal use and want to use one of the pretrained embedding models. Which will be the most useful, considering I don't want to pay for an API or tokens?
Thanks
0
votes
1
answer
53
views
How to infer probability if training with logits?
I train with smooth_l1_loss directly (without an activation) on values in the range of -5 to 5, where each value indicates a result for that position in the tensor.
Whenever you do label classification you do binary ...
8
votes
6
answers
3k
views
Why can’t an artificial neural network figure out what a lava lamp will do next?
I was watching a lava lamp and started wondering if a computer could ever predict what the blobs are going to do next.
It seems like, in theory, you could record tons of video frames, feed them to a ...
0
votes
1
answer
75
views
Finding the right setup for a local LLM for big contexts [closed]
I am trying to figure out what I would need for a setup to do the following task:
I have a Korean text of about 10-20 pages. I need to translate it, anonymize it, and also swap out some words with ...
0
votes
1
answer
61
views
How does the Bellman equation handle a scenario like this?
My background: I'm a medical student who has dabbled a little in computational psychiatry. I came across the Bellman equation in an introductory text on computational psychiatry and tried to read up ...
0
votes
0
answers
34
views
RAG on legal documents: Is JSON preprocessing necessary before chunking?
I'm currently working on a legal RAG system that will ingest several laws from my country. I have these laws as PDFs.
The structure of these laws is: TITLE → CHAPTER → SECTION → ARTICLE.
Example (...
2
votes
2
answers
99
views
Why are LLMs for coding not more focused?
When asking llama3.3:70b about its supported natural and programming languages it lists more than a dozen each. As a user I am usually asking questions in one natural language for one programming ...
0
votes
1
answer
109
views
Why does ChatGPT go on unending rambles when asked certain prompts?
Certain prompts, like "is there a seahorse emoji?" or "are there any NFL teams ending in s?", which both have the answer "no", trigger some sort of haywire response in ...
0
votes
0
answers
32
views
Why am I unable to make the TD3 algorithm overfit on the task of placing points in 3D space?
I am trying to train a TD3 algorithm to place points in 3D space.
However, I am currently not able to even get the model to overfit on a small number of data points.
As far as I can tell, part of the ...
0
votes
1
answer
40
views
How to convert intuition into feature vectors for computing similarity/closeness?
Kind of a broad question, but let me narrow it down to one specific use-case/example I'm currently working on / interested in: Finding "closeness" or "similarity" of writing system ...
1
vote
1
answer
59
views
Plateau in performance of DQN snake AI [closed]
I'm currently making an AI to play snake using DQN and have run into a performance plateau. Here is the information about the architecture of the model.
Network's design:
I use CNN + MLP for both ...
1
vote
1
answer
62
views
How to interpret validation loss spikes in YOLO training? Is this a sign of overfitting?
I have trained a YOLO model using an augmented version of the DAWN dataset, which I obtained from Roboflow. The training ran for 100 epochs, and the resulting metrics, such as mAP50 reaching around 0....
1
vote
1
answer
72
views
In reinforcement learning, how is exploration efficiency quantitatively measured across environments with different entropy?
Many RL papers discuss exploration strategies (like UCB or entropy bonuses), but their success depends heavily on the environment’s entropy. How do researchers formally normalize or compare ...
0
votes
0
answers
21
views
In DDPG, is it okay for the Polyak tau value to be larger than the critic's learning rate?
In DDPG, is it okay for the Polyak tau value to be larger than the critic's learning rate? If I am right, the critic target network is updated from the main critic network. And if the main ...
0
votes
0
answers
35
views
How can I detect suspicious customer actions using computer vision?
I'm designing a computer vision system to detect suspicious customer behavior in a store, for example:
• unusual body movements near a cashier or shelf
• sudden hiding motions
• loitering for too long in ...
0
votes
1
answer
43
views
How to manage a discussion between two persons
I want to code (in C++) a method allowing a character C1 to ask or request something from another character C2.
C2's answer will depend on the environment:
does it know the thing C1 is looking for?
...
1
vote
1
answer
37
views
What magnitude of gradient jump is considered evidence of the exploding/vanishing gradient issue?
I'm having some issues training a convolutional neural network: the loss initially decreases but then suddenly becomes NaN. I guess the problem could be related to some exploding/...
0
votes
1
answer
23
views
Does Azure OpenAI or Amazon Bedrock Store the data sent via API calls?
I have some client data that contains a lot of PII. I want to use Azure or AWS LLMs, but I am afraid they will use this data for further training or send it to some third party. Could ...
0
votes
0
answers
18
views
How can I use structured JSON data from PostgreSQL to populate LightRAG’s Neo4j graph without letting the LLM hallucinate relationships?
I’m working on a hybrid RAG (Retrieval-Augmented Generation) system that combines:
Structured data from PostgreSQL
A Neo4j graph database
LightRAG for hybrid (graph + vector) search
I want to use ...
0
votes
1
answer
26
views
Adaptive Prefix Tuning
With reference to the paper Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning, I have the following difficulties in understanding the implementation:
• "$h_{i-1}$ is ...
1
vote
1
answer
42
views
What is the step-by-step process used by generative AI when altering image content?
I asked generative AI to change a small object in a complicated image:
Using computer vision (or a human operator using Photoshop), I understand the steps to be performed as follows:
Search for object ...
0
votes
0
answers
26
views
How are multi-domain datasets prepared for mid-sized language models (4B–7B) to ensure coherent generalization?
When training mid-sized language models (around 4B–7B parameters), how are datasets designed to maintain coherence and balance across distinct domains such as code, science, and general text?
I am ...
1
vote
0
answers
23
views
Recent results on real-time agent problems?
Most real-time formulations try to tackle the problem from an efficiency perspective, i.e. if we can get agents to react in a very small time $t \ll T$ (where $T$ is the environment update time), ...
0
votes
1
answer
47
views
What are the pros and cons of this algorithm for training of an MLP?
I got the following problem in a Computational Intelligence course exam.
Analyze the following formulas for training of an MLP as an alternative training algorithm for MLPs. Tell the pros and cons of ...
0
votes
0
answers
17
views
Are vector distances still relevant if embeddings are created with the same model but different quantization?
I use the bge-m3 model to create embeddings and store them in Postgres/pgvector.
I am curious if I can:
use F16 quantization during data creation and storage.
then use Q4_K_M quantization for user search/...
0
votes
1
answer
37
views
What are the privacy and transparency advantages of open-source voice-to-AI tools like Ito compared to closed systems?
I’ve been exploring open-source projects that connect speech recognition with large language models for intelligent voice input.
Recently I came across Ito, an open-source “voice-to-AI” interface that ...