Newest Questions
12,769 questions
0 votes · 0 answers · 14 views
PPO: How to exploit action equivalences in continuous approximation of large discrete (and constrained) action spaces
I face a reinforcement learning problem where the action space is large and constrained (integer points in an n-dimensional polyhedron that depends on the state).
To train the RL agent (PPO) I make ...
0 votes · 3 answers · 76 views
What are some practical use cases where generative AI has saved you time or boosted creativity?
I’ve been testing out different generative AI tools recently, and I’m wondering what kinds of real, everyday use cases people here have found most useful. Not just flashy demos — I mean the tools that ...
0 votes · 0 answers · 17 views
Intuition behind Load-Balancing Loss in the paper OUTRAGEOUSLY LARGE NEURAL NETWORKS: THE SPARSELY-GATED MIXTURE-OF-EXPERTS LAYER
I'm trying to implement the paper "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer", but got stuck while implementing the load-balancing loss. Could someone ...
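For context, the paper's importance loss penalizes the squared coefficient of variation (CV) of the per-expert "importance", i.e. the batchwise sum of each expert's gate value. A minimal NumPy sketch; the function name, the gate-matrix shape `(batch, num_experts)`, and the weight `w_importance = 0.1` are my own choices, not from the paper:

```python
import numpy as np

def importance_loss(gate_probs: np.ndarray, w_importance: float = 0.1) -> float:
    """Importance loss: w * CV(Importance)^2, where Importance is the
    per-expert sum of gate values over the batch (Shazeer et al., 2017)."""
    importance = gate_probs.sum(axis=0)        # shape: (num_experts,)
    cv = importance.std() / importance.mean()  # coefficient of variation
    return w_importance * cv ** 2

# Balanced gates -> zero loss; skewed gates -> positive loss.
balanced = np.full((4, 2), 0.5)      # both experts equally used
skewed = np.array([[0.9, 0.1]] * 4)  # expert 0 dominates every sample
print(importance_loss(balanced))     # 0.0
print(importance_loss(skewed))       # ~0.064
```

The loss is zero exactly when all experts receive equal total gate weight, which is the balancing behavior the auxiliary term is meant to encourage.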
2 votes · 1 answer · 669 views
How can the exact same model give different confusion matrices for the test dataset and the entire dataset?
I have recently implemented a simple artificial neural network with one hidden layer. I split my data using train_test_split and end up with the following confusion matrix on my test set.
...
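One relevant fact: since a fitted model's predictions are deterministic per sample, the confusion matrix over the entire dataset is exactly the train-set matrix plus the test-set matrix, so it mixes in training samples the model has already fit. A NumPy sketch (the threshold "model", the split, and the helper are stand-ins, not the asker's network):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Rows = true class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

rng = np.random.default_rng(0)
X = rng.normal(size=200)
y = (X + rng.normal(scale=0.5, size=200) > 0).astype(int)  # noisy labels

# stand-in for train_test_split: first 150 samples train, last 50 test
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]

predict = lambda x: (x > 0).astype(int)  # stand-in for the trained network

cm_test = confusion_matrix(y_te, predict(X_te))
cm_train = confusion_matrix(y_tr, predict(X_tr))
cm_full = confusion_matrix(y, predict(X))

# The "entire dataset" matrix is elementwise the train matrix plus the
# test matrix, because each sample's prediction is the same in both runs.
print(np.array_equal(cm_full, cm_train + cm_test))  # True
```

So different matrices on the test set versus the full dataset are expected whenever the model performs differently on the (memorized) training portion.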
2 votes · 1 answer · 37 views
Can self-attention capture the rate of change of a token?
From what I understand, the self-attention mechanism captures the dependency of a given token on various other tokens in a sequence. Inspired by nature, where natural laws are often expressed in terms ...
2 votes · 1 answer · 60 views
Alignment drift in LLMs
In AI security discussions I have sometimes heard that an aligned AI may drift, but I haven't found any papers reporting this phenomenon for current LLMs. I have found papers about LLMs faking ...
0 votes · 0 answers · 13 views
Create a global model from local models
Current scenario:
I have a task at hand: data with the columns timestamp, org_id, no_of_calls_on_premise, no_of_calls_cloud, and bw_savings. This is aggregated data on a daily basis (also I have ...
1 vote · 0 answers · 31 views
Applying the RTD task to a model trained with MLM leads to a decrease in performance as training progresses
We are developing a new LLM based on the CodeBERT architecture. As part of this effort, we initially trained our model using the Masked Language Modeling (MLM) objective with the Hugging Face API. To ...
0 votes · 0 answers · 26 views
What are the best practices for using Amazon SageMaker to develop, train, and deploy ML models for beginners?
I'm a developer with some experience in machine learning using local environments (e.g., scikit-learn, TensorFlow) but new to Amazon SageMaker and cloud-based platforms. I want to understand how to ...
4 votes · 1 answer · 107 views
Understanding the optimal value function in RL
The definition of the optimal policy (Section 3.6, Sutton &amp; Barto) states that $\pi \geq \pi'$ iff $v_{\pi}(s) \geq v_{\pi'}(s)$ for all $s \in S$.
I have difficulty understanding why the value (under ...
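A toy numerical check of this statewise ordering may help; the 2-state MDP below is my own construction, chosen so that one policy dominates the other at every state. Exact policy evaluation solves $v_\pi = r_\pi + \gamma P_\pi v_\pi$ as a linear system:

```python
import numpy as np

# 2-state MDP: action 0 yields reward 1, action 1 yields reward 0, and both
# actions share the same transition dynamics, so "always action 0" dominates.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])  # P[s, s'] (identical for both actions here)
gamma = 0.9

def evaluate(r_pi):
    """Exact policy evaluation: solve v = r_pi + gamma * P v."""
    return np.linalg.solve(np.eye(2) - gamma * P, r_pi)

v_good = evaluate(np.array([1.0, 1.0]))  # always action 0: reward 1 everywhere
v_bad = evaluate(np.array([0.0, 0.0]))   # always action 1: reward 0 everywhere

print(np.all(v_good > v_bad))  # True: v_pi(s) > v_pi'(s) at every state
```

Note the ordering is only a partial order in general: two policies can each be better in different states, in which case neither $\pi \geq \pi'$ nor $\pi' \geq \pi$ holds.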
2 votes · 1 answer · 44 views
Doubt regarding the convergence proof of $Q$-learning
I was trying to understand the proof of $Q$-learning from here. At page 17,
as you can see, $\|\Delta_t + Q^*\|^2_{\infty} \leq \|\Delta_t\|^2_{\infty} + \|Q^*\|^2_{\infty}$ has been used to make a bound on $Var(...
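For comparison, the bound that follows directly from the triangle inequality for the sup norm (my own derivation, not necessarily the exact step in the cited proof) is:

```latex
\|\Delta_t + Q^*\|_\infty \le \|\Delta_t\|_\infty + \|Q^*\|_\infty
\quad\Longrightarrow\quad
\|\Delta_t + Q^*\|_\infty^2 \le \bigl(\|\Delta_t\|_\infty + \|Q^*\|_\infty\bigr)^2
```

Squaring is valid because norms are non-negative; note the right-hand side retains the cross term $2\|\Delta_t\|_\infty\|Q^*\|_\infty$, which is where a bound that simply sums the two squared norms would differ.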
4 votes · 1 answer · 57 views
Are vision transformers scale invariant like CNNs?
I was trying to implement a vision transformer (RT-DETR) for object detection. I trained the model on 640x640 px images and tested it on a 2000x2000 px image containing many objects - the outputs did ...
1 vote · 1 answer · 17 views
Loss keeps increasing when using full-batch gradient descent
I am learning a linear regression model based on this tutorial. Following the example provided in the tutorial, it works fine with mini-batch stochastic gradient descent.
...
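One common cause of this symptom: switching from a mini-batch mean to a full-batch *sum* scales the gradient by the batch size, so a learning rate that was stable for mini-batches overshoots and the loss grows. A NumPy sketch under that assumption (the synthetic data and the learning rate are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=256)

def full_batch_gd(lr, reduce="mean", steps=20):
    """Fit y = w*x by full-batch gradient descent on squared error."""
    w = 0.0
    losses = []
    for _ in range(steps):
        err = X[:, 0] * w - y
        losses.append((err ** 2).mean())     # track MSE either way
        grad = 2 * (err * X[:, 0])           # per-sample gradients
        grad = grad.mean() if reduce == "mean" else grad.sum()
        w -= lr * grad
    return losses

# Summing per-sample gradients makes each step ~256x too large: diverges.
diverging = full_batch_gd(lr=0.05, reduce="sum")
# Averaging (equivalently, dividing lr by batch size) is stable at the same lr.
converging = full_batch_gd(lr=0.05, reduce="mean")
print(diverging[-1] > diverging[0])    # True: loss blew up
print(converging[-1] < converging[0])  # True: loss decreased
```

If the tutorial's mini-batch code averages the loss while the full-batch variant sums it (or the learning rate is left unchanged), this mismatch alone reproduces a monotonically increasing loss.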
5 votes · 2 answers · 267 views
Does value iteration still return the true Q-values in stochastic environment?
I'm working with the FrozenLake environment (8x8) from Gymnasium.
In the deterministic case (is_slippery=False), I understand that using value iteration can ...
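For intuition: value iteration contracts to the unique fixed point of the Bellman optimality operator whether or not transitions are stochastic, provided the backup uses the true transition probabilities (expectations over next states). A sketch on a made-up 2-state stochastic MDP, not the actual FrozenLake tables:

```python
import numpy as np

# Tiny stochastic MDP (2 states, 2 actions), a stand-in for slippery FrozenLake:
# P[a, s, s'] are transition probabilities, R[a, s] expected immediate rewards.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.5, 0.8]])
gamma = 0.95

Q = np.zeros((2, 2))  # Q[a, s]
for _ in range(1000):
    V = Q.max(axis=0)       # greedy state values
    Q = R + gamma * P @ V   # Bellman optimality backup (expectation over s')

# At the fixed point Q satisfies the Bellman optimality equation, i.e. it is
# the true Q*; stochasticity only means the backup averages over next states.
residual = np.abs(Q - (R + gamma * P @ Q.max(axis=0))).max()
print(residual < 1e-8)  # True: converged to the optimal Q-values
```

What changes with `is_slippery=True` is not the correctness of value iteration but that a model-free learner would have to estimate these expectations from samples; with the exact model, the computed Q-values remain the true ones.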
0 votes · 0 answers · 42 views
How to Improve Levenshtein Distance in CNN-BiLSTM Morse Decoder?
Problem Context:
I'm building a Morse code audio decoder using CNN-BiLSTM with CTC loss. My current 4-layer model achieves Levenshtein distance ≈0.6, but attempts to improve performance by adding a ...