NLP Collective
Discussions
Browse discussion posts about NLP.
Lawyer building a LegalTech chatbot — advice needed on architecture
Hi everyone, I'm a practicing lawyer venturing into the world of programming. My goal is to build a simple LegalTech chatbot that can help people answer basic legal questions (e.g., related to tenancy,...
Is R efficient for sentiment analysis?
I would like to explore sentiment analysis further, but I cannot decide whether I should start a project in Python or R. What would you suggest?
Spacy model training (NER task)
Hello everyone! I am training a spaCy model (version 3.8.4) for a Named Entity Recognition (NER) task. While the overall workflow functions correctly, I have observed an inconsistency during the ...
Why is there a sudden drop in both training and validation loss in a transformer-based model?
I am fine-tuning a fairly small model and have seen sudden drops in validation loss and training loss after the first step. Could anyone explain the reason behind it? Or is it a good sign? Step ...
Need guidance on a document version control project
Hello, I have a document version control project where basically two things need to be done: (1) identify which document is the latest of them, and (2) determine what the historical version control changes are on the ...
What is your ideal development environment for deep learning/NLP?
I'm curious about the ideal development setups for NLP developers who train deep learning models. In my own journey to work more with deep learning, I use Jupyter Notebooks a lot, but I'm ...
Getting started with natural language processing. Could it be helpful for making online communities healthy and safe?
I'm new to natural language processing; actually, I'm new to artificial intelligence. I'm interested in online community health and safety, and I'm wondering if NLP could be helpful and how to get ...
Alternative to OpenVINO's in-training optimization?
From optimum-intel's docs: Training-time optimization methods are deprecated and will be removed in optimum-intel v1.22.0. What are the alternatives to minimize quality loss in quantization? For ...
Reinforcement learning from human feedback vs. instruction tuned models: performance in open benchmarks
Why is there only a single Reinforcement Learning from Human Feedback (RLHF) model in the top 20 of the OpenLLM benchmark? Should we consider introducing more versatile datasets? Link to the ...
Literature Based Discovery, where to start?
Hello there! I recently started looking at the topic of Literature Based Discovery (LBD), which is basically "a form of knowledge extraction and automated hypothesis generation that uses ...
Using transformers for robot path planning
How could I use a transformer model for robot motion planning? Most works in the literature use transformer models for NLP tasks. Transformers have encoder and decoder parts. Is it possible to ...
A fantastic idea about using several sentences to represent another sentence in NLP
When learning NLP, I found that the current representation methods are basically word representation, so I wonder if there is a sentence representation? My hypothesis: to represent sentences using ...
Identifying the type of sources from User Prompts using Machine Learning
I am currently working on a project where we need to map user-entered prompts to predefined sources based on their content descriptions. The goal here is not to extract named entities but to understand ...
How are OCR texts post-processed to increase accuracy of recognition?
Has anyone worked in a company where they extract large amounts of text using OCR and then clean the text to be as accurate as possible? How is this done? Say I digitize a lot of legal documents, run ...
LangChain vs. LlamaIndex with LLMs for data ingestion
I would like to know about your experience with LangChain vs. LlamaIndex with LLMs for data ingestion. A comparison with real data would be helpful. Data formats can be APIs, PDFs, documents, SQL, etc. Query ...