Newest Questions
12,769 questions
0 votes · 0 answers · 14 views
PPO: How to exploit action equivalences in continuous approximation of large discrete (and constrained) action spaces
I face a reinforcement learning problem where the action space is large and constrained (integer points in an n-dimensional polyhedron that depends on the state).
To train the RL agent (PPO) I make ...
0 votes · 3 answers · 76 views
What are some practical use cases where generative AI has saved you time or boosted creativity?
I’ve been testing out different generative AI tools recently, and I’m wondering what kinds of real, everyday use cases people here have found most useful. Not just flashy demos — I mean the tools that ...
0 votes · 0 answers · 17 views
Intuition behind Load-Balancing Loss in the paper OUTRAGEOUSLY LARGE NEURAL NETWORKS: THE SPARSELY-GATED MIXTURE-OF-EXPERTS LAYER
I'm trying to implement the paper "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer", but got stuck while implementing the load-balancing loss. Could someone ...
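For context, the paper's importance loss penalizes the squared coefficient of variation (CV) of the per-expert "importance", i.e. the batchwise sum of each expert's gate value. A minimal NumPy sketch; the function name, the gate-matrix shape `(batch, num_experts)`, and the weight `w_importance = 0.1` are my own choices, not from the paper:

```python
import numpy as np

def importance_loss(gate_probs: np.ndarray, w_importance: float = 0.1) -> float:
    """Importance loss: w * CV(Importance)^2, where Importance is the
    per-expert sum of gate values over the batch (Shazeer et al., 2017)."""
    importance = gate_probs.sum(axis=0)        # shape: (num_experts,)
    cv = importance.std() / importance.mean()  # coefficient of variation
    return w_importance * cv ** 2

# Balanced gates -> zero loss; skewed gates -> positive loss.
balanced = np.full((4, 2), 0.5)      # both experts equally used
skewed = np.array([[0.9, 0.1]] * 4)  # expert 0 dominates every sample
print(importance_loss(balanced))     # 0.0
print(importance_loss(skewed))       # ~0.064
```

The loss is zero exactly when all experts receive equal total gate weight, which is the balancing behavior the auxiliary term is meant to encourage.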
2 votes · 1 answer · 669 views
How can the exact same model give different confusion matrices for the test dataset and the entire dataset?
I have recently implemented a simple artificial neural network with one hidden layer. I split my data using train_test_split and end up with the following confusion matrix on my test set.
...
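One relevant fact: since a fitted model's predictions are deterministic per sample, the confusion matrix over the entire dataset is exactly the train-set matrix plus the test-set matrix, so it mixes in training samples the model has already fit. A NumPy sketch (the threshold "model", the split, and the helper are stand-ins, not the asker's network):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Rows = true class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

rng = np.random.default_rng(0)
X = rng.normal(size=200)
y = (X + rng.normal(scale=0.5, size=200) > 0).astype(int)  # noisy labels

# stand-in for train_test_split: first 150 samples train, last 50 test
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]

predict = lambda x: (x > 0).astype(int)  # stand-in for the trained network

cm_test = confusion_matrix(y_te, predict(X_te))
cm_train = confusion_matrix(y_tr, predict(X_tr))
cm_full = confusion_matrix(y, predict(X))

# The "entire dataset" matrix is elementwise the train matrix plus the
# test matrix, because each sample's prediction is the same in both runs.
print(np.array_equal(cm_full, cm_train + cm_test))  # True
```

So different matrices on the test set versus the full dataset are expected whenever the model performs differently on the (memorized) training portion.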
2 votes · 1 answer · 37 views
Can self-attention capture the rate of change of a token?
From what I understand, the self-attention mechanism captures the dependency of a given token on various other tokens in a sequence. Inspired by nature, where natural laws are often expressed in terms ...
2 votes · 1 answer · 60 views
Alignment drift in LLMs
In AI security discussions I have sometimes heard that an aligned AI may drift, but I haven't found any papers reporting this phenomenon for current LLMs. I have found papers about LLMs faking ...
0 votes · 0 answers · 13 views
Create a global model from local models
Current scenario:
I have a task at hand: data with the columns timestamp, org_id, no_of_calls_on_premise, no_of_calls_cloud, and bw_savings. This is aggregated data on a daily basis (also I have ...
1 vote · 0 answers · 31 views
Applying the RTD task to a model trained with MLM leads to a decrease in performance as training progresses
We are developing a new LLM based on the CodeBERT architecture. As part of this effort, we initially trained our model using the Masked Language Modeling (MLM) objective with the Hugging Face API. To ...
0 votes · 0 answers · 26 views
What are the best practices for using Amazon SageMaker to develop, train, and deploy ML models for beginners?
I'm a developer with some experience in machine learning using local environments (e.g., scikit-learn, TensorFlow) but new to Amazon SageMaker and cloud-based platforms. I want to understand how to ...
4 votes · 1 answer · 107 views
Understanding the optimal value function in RL
The definition of the optimal policy (Section 3.6, Sutton &amp; Barto) states that $\pi \geq \pi'$ iff $v_{\pi}(s) \geq v_{\pi'}(s)$ for all $s \in S$.
I have difficulty understanding why the value (under ...
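A toy numerical check of this statewise ordering may help; the 2-state MDP below is my own construction, chosen so that one policy dominates the other at every state. Exact policy evaluation solves $v_\pi = r_\pi + \gamma P_\pi v_\pi$ as a linear system:

```python
import numpy as np

# 2-state MDP: action 0 yields reward 1, action 1 yields reward 0, and both
# actions share the same transition dynamics, so "always action 0" dominates.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])  # P[s, s'] (identical for both actions here)
gamma = 0.9

def evaluate(r_pi):
    """Exact policy evaluation: solve v = r_pi + gamma * P v."""
    return np.linalg.solve(np.eye(2) - gamma * P, r_pi)

v_good = evaluate(np.array([1.0, 1.0]))  # always action 0: reward 1 everywhere
v_bad = evaluate(np.array([0.0, 0.0]))   # always action 1: reward 0 everywhere

print(np.all(v_good > v_bad))  # True: v_pi(s) > v_pi'(s) at every state
```

Note the ordering is only a partial order in general: two policies can each be better in different states, in which case neither $\pi \geq \pi'$ nor $\pi' \geq \pi$ holds.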
2 votes · 1 answer · 44 views
Doubt regarding the convergence proof of $Q$-learning
I was trying to understand the proof of $Q$-learning from here. At page 17,
as you can see, $\|\Delta_t + Q^*\|^2_{\infty} \leq \|\Delta_t\|^2_{\infty} + \|Q^*\|^2_{\infty}$ has been used to make a bound on $Var(...
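For comparison, the bound that follows directly from the triangle inequality for the sup norm (my own derivation, not necessarily the exact step in the cited proof) is:

```latex
\|\Delta_t + Q^*\|_\infty \le \|\Delta_t\|_\infty + \|Q^*\|_\infty
\quad\Longrightarrow\quad
\|\Delta_t + Q^*\|_\infty^2 \le \bigl(\|\Delta_t\|_\infty + \|Q^*\|_\infty\bigr)^2
```

Squaring is valid because norms are non-negative; note the right-hand side retains the cross term $2\|\Delta_t\|_\infty\|Q^*\|_\infty$, which is where a bound that simply sums the two squared norms would differ.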
4 votes · 1 answer · 57 views
Are vision transformers scale invariant like CNNs?
I was trying to implement a vision transformer (RT-DETR) for object detection. I trained the model on 640x640 px images and tested it on a 2000x2000 px image containing many objects - the outputs did ...
1 vote · 1 answer · 17 views
Loss keeps increasing when using full-batch gradient descent
I am learning a linear regression model based on this tutorial. Following the example provided in the tutorial, it works fine with mini-batch stochastic gradient descent.
...
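One common cause of this symptom: switching from a mini-batch mean to a full-batch *sum* scales the gradient by the batch size, so a learning rate that was stable for mini-batches overshoots and the loss grows. A NumPy sketch under that assumption (the synthetic data and the learning rate are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=256)

def full_batch_gd(lr, reduce="mean", steps=20):
    """Fit y = w*x by full-batch gradient descent on squared error."""
    w = 0.0
    losses = []
    for _ in range(steps):
        err = X[:, 0] * w - y
        losses.append((err ** 2).mean())     # track MSE either way
        grad = 2 * (err * X[:, 0])           # per-sample gradients
        grad = grad.mean() if reduce == "mean" else grad.sum()
        w -= lr * grad
    return losses

# Summing per-sample gradients makes each step ~256x too large: diverges.
diverging = full_batch_gd(lr=0.05, reduce="sum")
# Averaging (equivalently, dividing lr by batch size) is stable at the same lr.
converging = full_batch_gd(lr=0.05, reduce="mean")
print(diverging[-1] > diverging[0])    # True: loss blew up
print(converging[-1] < converging[0])  # True: loss decreased
```

If the tutorial's mini-batch code averages the loss while the full-batch variant sums it (or the learning rate is left unchanged), this mismatch alone reproduces a monotonically increasing loss.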
5 votes · 2 answers · 267 views
Does value iteration still return the true Q-values in stochastic environment?
I'm working with the FrozenLake environment (8x8) from Gymnasium.
In the deterministic case (is_slippery=False), I understand that using value iteration can ...
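For intuition: value iteration contracts to the unique fixed point of the Bellman optimality operator whether or not transitions are stochastic, provided the backup uses the true transition probabilities (expectations over next states). A sketch on a made-up 2-state stochastic MDP, not the actual FrozenLake tables:

```python
import numpy as np

# Tiny stochastic MDP (2 states, 2 actions), a stand-in for slippery FrozenLake:
# P[a, s, s'] are transition probabilities, R[a, s] expected immediate rewards.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.5, 0.8]])
gamma = 0.95

Q = np.zeros((2, 2))  # Q[a, s]
for _ in range(1000):
    V = Q.max(axis=0)       # greedy state values
    Q = R + gamma * P @ V   # Bellman optimality backup (expectation over s')

# At the fixed point Q satisfies the Bellman optimality equation, i.e. it is
# the true Q*; stochasticity only means the backup averages over next states.
residual = np.abs(Q - (R + gamma * P @ Q.max(axis=0))).max()
print(residual < 1e-8)  # True: converged to the optimal Q-values
```

What changes with `is_slippery=True` is not the correctness of value iteration but that a model-free learner would have to estimate these expectations from samples; with the exact model, the computed Q-values remain the true ones.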
0 votes · 0 answers · 42 views
How to Improve Levenshtein Distance in CNN-BiLSTM Morse Decoder?
Problem Context:
I'm building a Morse code audio decoder using CNN-BiLSTM with CTC loss. My current 4-layer model achieves Levenshtein distance ≈0.6, but attempts to improve performance by adding a ...