
Datasaur Launches LLM Lab to Build and Train Custom ChatGPT and Similar Models 

Oct. 27, 2023 -- Datasaur, a leading natural language processing (NLP) data-labeling platform, has launched LLM Lab, an all-in-one interface for data scientists and engineers to build and train custom LLMs like ChatGPT. The product will provide a wide range of features that let users test different foundation models, connect to their own internal documents, optimize server costs, and more.

The use of LLMs as a tool has risen sharply in the past year. In fact, 61.6% of respondents in a recent survey indicated they are using LLMs (e.g., ChatGPT and GitHub Copilot) for at least one use case, such as chatbots, customer support, and coding. At the same time, companies like Apple, Amazon, and Spotify are banning employee access to OpenAI services, citing business and data privacy concerns. These companies are increasingly looking to build their own internal solutions. LLM Lab provides an extensive starting point for such teams.

“We regularly connect with data science teams around the world looking to build their own LLMs,” said Ivan Lee, CEO and founder of Datasaur. “We’ve built a tool that holistically addresses the most common pain points, supports rapidly evolving best practices, and applies our signature design philosophy to simplify and streamline the process. Over the past year, we have constructed and delivered custom models for our own internal use and our clients, and from that experience, we were able to create a scalable, easy-to-use LLM product.”

Datasaur works with companies like Google and Blackbird to help label data 5.9x faster than manual labeling. The company has spent the last four years developing a comprehensive NLP solution, supporting methods like entity recognition, text classification, speaker diarization, and more. As generative AI has captured the industry’s attention, LLM Lab complements Datasaur’s existing NLP platform to provide a one-stop shop for all things related to text, documents, and audio. The company has seen a growing trend toward a hybrid approach that complements traditional NLP models with LLM capabilities. Datasaur’s platform will now support data scientists in both approaches, even allowing them to mix the two and use LLMs to automate data labeling for traditional models.

As we head into 2024, Datasaur will continue to invest in LLM development to fortify its position as the AI industry’s leading NLP platform. LLM Lab will help users save their most successful configurations and prompts and share their findings with colleagues. It will continue integrating with popular and up-and-coming foundation models such as Llama 2, Falcon, and Claude, along with technologies such as Pinecone, to slot seamlessly into model training workflows.

To learn more about Datasaur, please visit https://datasaur.ai.

About Datasaur

Datasaur leads the NLP industry with its comprehensive and automated data labeling solution. Founded in 2019 and headquartered in Silicon Valley, the company helps financial, legal, and healthcare companies turn raw unstructured data into valuable ML datasets. Prior to Datasaur, CEO Ivan Lee sold his first company, Loki Studios, to Yahoo and led ML teams at Yahoo and Apple. Datasaur graduated from the Stanford StartX (F19) and Y Combinator (W20) accelerators and is backed by Initialized Capital, Greg Brockman (OpenAI President), and Calvin French-Owen (Segment CTO).


Source: Datasaur

AIwire