Transform your data with natural language using LLMs
Datatune is a Python library that lets you transform tabular data using natural language prompts powered by Large Language Models (LLMs). Unlike traditional tools that rely on schema-based queries, Datatune gives the LLM access to your actual data, enabling context-aware transformations at the row level that are hard to express in SQL or pandas.
Extract and categorize information from unstructured data
Use Case: Map operations let you add new columns to your dataset by extracting, classifying, or transforming information from existing columns. This is perfect for enriching your data with insights that would be tedious to code manually—like categorizing products, analyzing sentiment, or extracting structured information from text.
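To make the idea of a row-level map concrete, here is a minimal sketch in plain Python. It is not Datatune's API: `map_rows` and `keyword_classifier` are hypothetical names, and the "model" is a keyword stub standing in for an LLM call so the example runs offline. It only shows the shape of the operation: each row goes in with a prompt, and new columns come back.

```python
from typing import Callable

Row = dict[str, str]


def keyword_classifier(prompt: str, row: Row) -> Row:
    """Hypothetical offline stand-in for an LLM call (it ignores the prompt)."""
    text = row["description"].lower()
    if "laptop" in text or "phone" in text:
        return {"category": "Electronics"}
    if "shirt" in text or "jeans" in text:
        return {"category": "Clothing"}
    return {"category": "Other"}


def map_rows(rows: list[Row], prompt: str, model: Callable[[str, Row], Row]) -> list[Row]:
    """Return new rows: each original row plus the columns the model produced."""
    return [{**row, **model(prompt, row)} for row in rows]


products = [
    {"name": "ZenBook 14", "description": "Lightweight laptop with 16 GB RAM"},
    {"name": "Classic Tee", "description": "Cotton shirt, crew neck"},
]

enriched = map_rows(products, "Classify each product into a category.", keyword_classifier)
for row in enriched:
    print(row["name"], "->", row["category"])
```

The original columns are preserved and the new `category` column is added alongside them, which is the enrichment pattern described above. With a real LLM in place of the stub, the prompt would drive the classification instead of hard-coded keywords.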
Filter rows based on semantic criteria across multiple columns
Use Case: Filter operations let you remove rows based on complex, semantic criteria that would be difficult to express with traditional code. Instead of writing nested if-statements or regex patterns, describe what you want to keep or remove in natural language. This is well suited to data cleanup, quality control, and targeted analysis.
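A semantic filter can be pictured as a per-row yes/no question. The sketch below is an illustration under stated assumptions, not Datatune's API: `filter_rows` and `looks_genuine` are hypothetical names, and the judge is a simple heuristic standing in for the LLM so the code runs without network access.

```python
from typing import Callable

Row = dict[str, str]


def looks_genuine(prompt: str, row: Row) -> bool:
    """Hypothetical offline stand-in for an LLM judgment (it ignores the prompt)."""
    text = row["review"].lower()
    # Treat reviews containing links, or shorter than three words, as low quality.
    return "http" not in text and len(text.split()) >= 3


def filter_rows(rows: list[Row], prompt: str, judge: Callable[[str, Row], bool]) -> list[Row]:
    """Keep only the rows for which the judge answers True."""
    return [row for row in rows if judge(prompt, row)]


reviews = [
    {"user": "a", "review": "Battery lasts two full days, very happy."},
    {"user": "b", "review": "Cheap deals at http://spam.example"},
    {"user": "c", "review": "ok"},
]

kept = filter_rows(reviews, "Keep only genuine, informative reviews.", looks_genuine)
print([row["user"] for row in kept])
```

The point of the pattern is that the keep-or-remove decision lives in the prompt and the model rather than in hand-written rules. The heuristic here is only a placeholder for that decision.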
Protect sensitive information while preserving data utility
Use Case: When working with sensitive data, you need to protect privacy while maintaining data utility for analysis. Datatune can redact PII (Personally Identifiable Information), PHI (Protected Health Information), or other sensitive data based on context, rather than relying on simple find-and-replace rules.
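Redaction follows the same row-level pattern: a column goes in and a masked version comes out. The sketch below is a hedged illustration, not Datatune's API. `redact_rows` and `pattern_redactor` are hypothetical names, and the redactor is a regex stub used only so the example runs offline. It is exactly the kind of pattern-based rule that a context-aware model is meant to improve on.

```python
import re
from typing import Callable

Row = dict[str, str]

# Deliberately simple patterns for the offline stub; not production-grade PII detection.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def pattern_redactor(prompt: str, text: str) -> str:
    """Hypothetical stand-in for an LLM redaction call (it ignores the prompt)."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def redact_rows(rows: list[Row], column: str, prompt: str,
                redactor: Callable[[str, str], str]) -> list[Row]:
    """Return copies of the rows with the chosen column redacted."""
    return [{**row, column: redactor(prompt, row[column])} for row in rows]


tickets = [
    {"id": "1", "notes": "Contact Jane at jane.doe@example.com or 555-123-4567 about the refund."},
]

redacted = redact_rows(tickets, "notes", "Mask all personal identifiers.", pattern_redactor)
print(redacted[0]["notes"])
```

The stub masks the email address and phone number but leaves the name "Jane" untouched, because no pattern describes it. Deciding that a token is a person's name depends on context, which is the gap the use case above refers to.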
Transform and extract data with natural language prompts
Remove rows based on semantic criteria
Let AI plan and execute complex transformations
Process datasets larger than LLM context windows