📨 AI for Social Impact Deep Dive: Data, Data, Data
It's not the most exciting topic, but you should definitely read this.
✍🏼 A Note From the Editor
Welcome to your Data Deep Dive! Data may be the least exciting topic of all, yet it is the most important: it is the backbone of AI, and how your data is structured shapes which use cases you can unlock for operational efficiency. In the world of computer science, the adage “garbage in, garbage out” tells you everything you need to know about data.
💻 Data & Model Training
AI tools learn by studying massive amounts of data. During training, these models process billions of data points, identifying patterns, relationships, and structures that help them predict what should come next. This is why AI companies are constantly seeking more data to train their models.
Despite the world's data doubling every three to four years, experts say AI models are actually running out of training data. “AI is able to ingest data faster than we can generate new data it hasn’t seen before.” And it’s less about the volume of data and more about the variety of data needed to train models. Researchers are now exploring "synthetic data" (computationally generated datasets based on physics, chemistry, and biology principles) to push AI capabilities forward.
📊 Structured Data
Structured data is labeled information organized in a predictable, consistent format (think spreadsheets, databases, and forms). It is relatively easy for AI systems to process because it's already categorized in ways machines can understand.
AI tools can quickly spot trends in structured data—like identifying which program interventions correlate with the strongest outcomes. The challenge is keeping data clean, especially with staff transitions, inconsistent labeling taxonomies, and well, a lack of time/resources/capacity to prioritize data hygiene.
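To make the data hygiene point concrete, here is a minimal Python sketch of what inconsistent labels do to a simple trend analysis. Everything in it is hypothetical: the records, the field names (`intervention`, `outcome_score`), and the helper functions are made up for illustration, and real cleanup usually needs more than trimming spaces and lowercasing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical program records; field names and values are invented for illustration.
records = [
    {"intervention": "Tutoring", "outcome_score": 82},
    {"intervention": "tutoring ", "outcome_score": 78},
    {"intervention": "Mentorship", "outcome_score": 70},
    {"intervention": "MENTORSHIP", "outcome_score": 74},
    {"intervention": "Workshops", "outcome_score": 65},
]


def normalize_label(label: str) -> str:
    """Trim whitespace and lowercase so 'Tutoring' and 'tutoring ' count as one label."""
    return label.strip().lower()


def average_outcome_by_intervention(rows):
    """Group outcome scores by cleaned intervention label and average each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[normalize_label(row["intervention"])].append(row["outcome_score"])
    return {label: mean(scores) for label, scores in grouped.items()}


averages = average_outcome_by_intervention(records)

# Print interventions from highest to lowest average outcome.
for label, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {avg}")
```

Without the normalization step, the five records would be split across five separate labels instead of three, and the averages would be far less useful.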
😵‍💫 Unstructured Data
Unstructured data is everything else: program narratives, beneficiary testimonials, grant proposals, meeting notes, photos, videos. Until recently, this information was incredibly difficult for computers to analyze because it doesn't follow predictable patterns, requires understanding context and nuance, and often combines multiple formats (text, images, audio).
This is where modern AI capabilities come into play. Multimodal large language models can read and understand unstructured text, analyze images and videos, and recognize speech to transcribe and analyze audio. This means you can extract insights from grant reports, beneficiary stories, and focus groups that previously required human analysis and hours of review. Of course, we are always the “humans in the loop” to ensure that the AI output is accurate and up to our standards. Also, be sure to omit sensitive information when using AI tools to analyze unstructured data if privacy is a concern.
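If you want a starting point for stripping sensitive details before pasting text into an AI tool, the Python sketch below masks email addresses and US-style phone numbers. The patterns, the placeholder tags, and the `redact` function are assumptions for illustration only. They will miss plenty of real-world formats and do nothing about names or addresses, so a human review is still needed.

```python
import re

# Deliberately simple, illustrative patterns; they do not cover every format.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_PATTERN = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def redact(text: str) -> str:
    """Replace email addresses and US-style phone numbers with placeholder tags."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    text = PHONE_PATTERN.sub("[PHONE]", text)
    return text


# Hypothetical meeting note; the contact details are made up.
note = "Maria can be reached at maria@example.org or (555) 123-4567 after the session."
redacted = redact(note)
print(redacted)
```

Note that the name in the sample note is left untouched, which is exactly why a quick manual pass matters before sharing anything sensitive.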
🔒 Data Privacy & Security
Hopefully you know this by now, but if you don’t, it bears repeating. Most free AI tools use your inputs to improve their models (read: you are paying with your data), so your data becomes part of their training data. If you pay for a tool, you can often opt out of model training (the default usually opts you in, so double-check your settings!).
Big Tech companies now offer enterprise plans in which data is not used for training and is processed with encryption and security protocols. If you are a Microsoft (Copilot) or a Google Workspace (Gemini) user, you can check out what your current tech stack offers. Read about their privacy settings and AI data training policies to help you decide what tools you may want to sanction for organization-wide use.
And regardless of which tool you decide to use, you may want to develop a data governance framework for your organization to ensure that your team is all on the same page regarding data hygiene and usage.
👋🏼 About AI for Social Impact
I’m Joanna, and I’m on a mission to help folks in the social impact sector understand, experiment with, and responsibly adopt AI. We don’t have time to waste, but we also can’t get left behind.
Let’s move the sector forward together. 💫
♥️ Spread the Love
Spread the love and forward this newsletter to anyone who might benefit from a dose of AI inspo!
Thank you for being part of the community. 🫶🏼