Newest Questions

0 votes
0 answers
14 views

PPO: How to exploit action equivalences in continuous approximation of large discrete (and constrained) action spaces

I face a reinforcement learning problem where the action space is large and constrained (integer points in an n-dimensional polyhedron that depends on the state). To train the RL agent (PPO) I make ...
asked by BotsAgainstCaptchas
0 votes
3 answers
76 views

What are some practical use cases where generative AI has saved you time or boosted creativity?

I’ve been testing out different generative AI tools recently, and I’m wondering what kinds of real, everyday use cases people here have found most useful. Not just flashy demos — I mean the tools that ...
asked by FaceSwapAI
0 votes
0 answers
17 views

Intuition behind Load-Balancing Loss in the paper OUTRAGEOUSLY LARGE NEURAL NETWORKS: THE SPARSELY-GATED MIXTURE-OF-EXPERTS LAYER

I'm trying to implement the paper "OUTRAGEOUSLY LARGE NEURAL NETWORKS: THE SPARSELY-GATED MIXTURE-OF-EXPERTS LAYER", but got stuck while implementing the load-balancing loss. Could someone ...
asked by qmzp
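For reference, the "importance" half of that paper's balancing objective penalizes the squared coefficient of variation of per-expert gate mass over a batch. The sketch below is an assumed simplification: the paper's smoothed load term is omitted, and `w_importance` is a hypothetical weight.

```python
import numpy as np

def importance_loss(gates: np.ndarray, w_importance: float = 0.1) -> float:
    """gates: (batch, n_experts) gate probabilities from the router.

    Penalizes uneven expert usage via the squared coefficient of
    variation of per-expert importance (gate mass summed over the batch).
    """
    importance = gates.sum(axis=0)                      # per-expert mass
    cv_sq = importance.var() / (importance.mean() ** 2 + 1e-10)
    return w_importance * cv_sq

# Perfectly balanced gates give zero loss; skewed gates are penalized.
balanced = np.full((4, 2), 0.5)
skewed = np.array([[1.0, 0.0], [1.0, 0.0]])
print(importance_loss(balanced))   # -> 0.0
print(importance_loss(skewed))     # positive
```

The key design point is that the penalty depends only on the *distribution* of gate mass across experts, so it pushes the router toward balance without prescribing which expert handles which token.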
2 votes
1 answer
669 views

How can the exact same model give different confusion matrices for the test dataset and the entire dataset?

I have recently implemented a simple artificial neural network with 1 hidden layer. I split my data using train_test_split and I end up with the following confusion matrix in my test set. ...
asked by The Logician
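The usual explanation: evaluating on the entire dataset mixes in the training rows the network has already fit, so that confusion matrix looks better than the held-out one. A minimal scikit-learn sketch (hypothetical data and model, standing in for the asker's network):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix

# Hypothetical data standing in for the asker's dataset.
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer, as in the question.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                    random_state=0).fit(X_train, y_train)

# The two matrices differ because the "entire dataset" contains the
# already-seen training rows, on which the model scores higher.
cm_test = confusion_matrix(y_test, clf.predict(X_test))
cm_full = confusion_matrix(y, clf.predict(X))
print(cm_test)
print(cm_full)
```

The entries of `cm_full` are not simply a scaled-up `cm_test`: they are dominated by in-sample predictions, which is why the "same model" appears more accurate there.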
2 votes
1 answer
37 views

Can Self Attention capture rate of change of token?

From what I understand, the self-attention mechanism captures the dependency of a given token on various other tokens in a sequence. Inspired by nature, where natural laws are often expressed in terms ...
asked by Manish Kumar Singh
2 votes
1 answer
60 views

Alignment drift in LLMs

In AI security discussions I have sometimes heard that an aligned AI may drift, but I didn't find any papers reporting this phenomenon for current LLMs. I have found papers about LLMs faking ...
asked by user47175
0 votes
0 answers
13 views

Create a global model from local models

Current scenario: I have a task at hand. I have data with the columns timestamp, org_id, no_of_calls_on_premise, no_of_calls_cloud, and bw_savings, aggregated on a daily basis. (Also I have ...
asked by Kush Rohra
1 vote
0 answers
31 views

Applying the RTD task to a model trained with MLM leads to a decrease in performance as training progresses

We are developing a new LLM based on the CodeBERT architecture. As part of this effort, we initially trained our model using the Masked Language Modeling (MLM) objective with the HuggingFace API. To ...
asked by One Bad Student
0 votes
0 answers
26 views

What are the best practices for using Amazon SageMaker to develop, train, and deploy ML models for beginners?

I'm a developer with some experience in machine learning using local environments (e.g., scikit-learn, TensorFlow) but new to Amazon SageMaker and cloud-based platforms. I want to understand how to ...
asked by dimuth k
4 votes
1 answer
107 views

Understanding the optimal value function in RL

The definition of the optimal policy (Sutton & Barto, section 3.6) states that $\pi \geq \pi'$ iff $v_{\pi}(s) \geq v_{\pi'}(s)$ for all $s \in S$. I have difficulty understanding why the value (under ...
asked by ahron
2 votes
1 answer
44 views

Doubt regarding the convergence proof of $Q$-learning

I was trying to understand the proof of $Q$-learning from here. On page 17, as you can see, $\|\Delta_t + Q^*\|^2_{\infty} \leq \|\Delta_t\|^2_{\infty} + \|Q^*\|^2_{\infty}$ has been used to make a bound on $Var(...
asked by Subhajit Saha
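The step quoted in this excerpt is presumably (an assumption, since the excerpt is truncated) an application of the triangle inequality for the sup norm, which after squaring gives

$$\|\Delta_t + Q^*\|_{\infty} \leq \|\Delta_t\|_{\infty} + \|Q^*\|_{\infty} \quad\Longrightarrow\quad \|\Delta_t + Q^*\|^2_{\infty} \leq \left(\|\Delta_t\|_{\infty} + \|Q^*\|_{\infty}\right)^2,$$

which is the usual route to a bound on the variance of the TD target in such proofs. Note that $(a+b)^2 \leq a^2 + b^2$ does not hold in general, so any bound of that shape would need the cross term accounted for.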
4 votes
1 answer
57 views

Are vision transformers scale invariant like CNNs?

I was trying to implement a vision transformer (RT-DETR) for object detection. I trained the model on 640x640 px images and tested it on a 2000x2000 px image containing many objects - the outputs did ...
asked by Lockhart
1 vote
1 answer
17 views

Loss keeps increasing when using full-batch gradient descent

I am learning a linear regression model based on this tutorial. Following the example provided there, it works fine with mini-batch stochastic gradient descent. ...
asked by hguser
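One common cause, offered here as an assumption about the asker's setup: a learning rate that is stable for noisy mini-batch steps can exceed the stability threshold of full-batch gradient descent on the same problem, making the loss grow every step. A toy least-squares illustration (hypothetical data, not the tutorial's code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

def full_batch_losses(lr, steps=20):
    """Run full-batch gradient descent on 1-D least squares, no bias term."""
    w = 0.0
    losses = []
    for _ in range(steps):
        pred = w * X[:, 0]
        grad = 2.0 * np.mean((pred - y) * X[:, 0])   # d/dw of the MSE
        w -= lr * grad
        losses.append(float(np.mean((w * X[:, 0] - y) ** 2)))
    return losses

stable = full_batch_losses(lr=0.1)     # below 2/L: loss shrinks
diverging = full_batch_losses(lr=1.5)  # above 2/L: loss grows each step
print(stable[-1], diverging[-1])
```

For quadratic losses, gradient descent diverges once the learning rate exceeds 2/L (L being the curvature, here roughly 2·mean(x²)), so the fix is usually just a smaller step size for the full-batch run.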
5 votes
2 answers
267 views

Does value iteration still return the true Q-values in stochastic environment?

I'm working with the FrozenLake environment (8x8) from Gymnasium. In the deterministic case (is_slippery=False), I understand that using value iteration can ...
asked by Jien Weng
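On the underlying question: value iteration converges to the true optimal Q-values in stochastic MDPs too, provided the backup uses the true transition probabilities. A minimal tabular sketch on a toy 2-state MDP (an assumed stand-in, not the Gymnasium FrozenLake API):

```python
import numpy as np

# Toy stochastic MDP: 2 states, 2 actions.
# P[s, a, s'] = transition probability, R[s, a] = expected reward.
n_states, n_actions, gamma = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

Q = np.zeros((n_states, n_actions))
for _ in range(1000):
    V = Q.max(axis=1)                  # greedy state values
    Q_new = R + gamma * P @ V          # Bellman optimality backup
    if np.abs(Q_new - Q).max() < 1e-10:
        break                          # sup-norm contraction has converged
    Q = Q_new
print(Q)
```

The expectation over next states (`P @ V`) is exactly what absorbs the stochasticity; what value iteration cannot recover is an environment whose transition model you only estimate from samples.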
0 votes
0 answers
42 views

How to Improve Levenshtein Distance in CNN-BiLSTM Morse Decoder?

Problem Context: I'm building a Morse code audio decoder using CNN-BiLSTM with CTC loss. My current 4-layer model achieves Levenshtein distance ≈0.6, but attempts to improve performance by adding a ...
asked by alexander
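For readers unfamiliar with the metric: the reported ≈0.6 is presumably a normalized edit distance between decoded and reference text. A plain dynamic-programming implementation of (unnormalized) Levenshtein distance:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("SOS", "SOO"))  # -> 1 (one substitution)
```

Dividing by the reference length gives the normalized score usually reported alongside CTC models.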
