
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
      <title>AI Wow</title>
      <link>https://wow.pjh.is/journal</link>
      <description>A study journal.</description>
      <language>en-gb</language>
      <managingEditor>wow@pjh.is (Peter Hartree)</managingEditor>
      <webMaster>wow@pjh.is (Peter Hartree)</webMaster>
      <lastBuildDate>Thu, 30 Oct 2025 15:00:00 GMT</lastBuildDate>
      <atom:link href="https://wow.pjh.is/tags/data-processing/feed.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid>https://wow.pjh.is/journal/make-graphs-with-claude-artifacts</guid>
    <title>Make graphs from screenshots with Claude</title>
    <link>https://wow.pjh.is/journal/make-graphs-with-claude-artifacts</link>
    <content:encoded><![CDATA[<p>You can take a screenshot of some data, and ask Claude to make a graph.</p>
<p>This can be much faster than e.g. exporting a CSV from Mixpanel, importing to Google Sheets, then making a graph there.</p>
<p>In fact, I often use Claude to make graphs from data that's already in a Google Sheet. Reasons:</p>
<ol>
<li>
<p>Claude makes better looking graphs, with good formatting choices (e.g. informative subtitles).</p>
</li>
<li>
<p>It often makes the graphs interactive in useful ways. Sometimes it's nice to paste a screenshot of the Claude graph into a Slack discussion or Google Doc report, <em>and</em> link to the interactive artifact.</p>
</li>
<li>
<p>Using Sonnet 4.5, it typically takes 5-15 seconds. For me, that's roughly as quick as making a decent graph in Google Sheets.</p>
</li>
</ol>
<p>Say "Use React" in your prompt to force Claude to make an interactive Artifact. Without that, it'll sometimes decide to make a rather ugly graph using Python.</p>
<p><a href="https://claude.ai/public/artifacts/f7d59807-750e-4366-87e5-6d7158dc0c7d"></a></p>
<p>I've never seen OCR mistakes or hallucinations. My datasets always contain fewer than 100 cells. For larger datasets, I'd cross-check carefully (or just upload a CSV rather than a screenshot).</p>
<p>If you have particular requirements or style preferences for your graphs, tell Claude (once) by defining a <a href="https://support.claude.com/en/articles/12512176-what-are-skills">Claude Skill</a>.</p>
<p>ChatGPT and Gemini can do this too, but their graphs are ugly, and their defaults are worse.</p>
]]></content:encoded>
    <pubDate>Thu, 30 Oct 2025 15:00:00 GMT</pubDate>
    <atom:updated>2025-10-30T15:00:00.000Z</atom:updated>
    <author>wow@pjh.is (Peter Hartree)</author>
    <category>journal</category><category>data processing</category>
  </item>

  <item>
    <guid>https://wow.pjh.is/journal/airtable-spreadsheet-assistant</guid>
    <title>The Airtable sidebar assistant is great</title>
    <link>https://wow.pjh.is/journal/airtable-spreadsheet-assistant</link>
    <content:encoded><![CDATA[<p>The Airtable sidebar assistant ("Airtable Omni") is good.</p>
<p>I decided to play around some more, after having a great time with <a href="https://wow.pjh.is/journal/spreadsheet-data-enrichment">data enrichment</a> and specifically <a href="https://www.airtable.com/guides/scale/how-to-use-airtable-ai">Airtable AI fields</a>.</p>
<p>I started with a sheet like this:</p>
<pre><code>Name, Email, Job title, Organization, Job description
</code></pre>
<p>Some prompts I tried:</p>
<p>Result: a column with the following formula:</p>
<pre><code>"https://tally.so/r/mKjgXD?name=" &#x26; ENCODE_URL_COMPONENT(scheduler_display_name) &#x26; "&#x26;email=" &#x26; ENCODE_URL_COMPONENT(scheduler_email) &#x26; "&#x26;job_title=" &#x26; ENCODE_URL_COMPONENT({AI Job Title (internet)})
</code></pre>
<p>Perfect.</p>
<p>Next:</p>
<p>Bad. Most of the profile links were hallucinated.</p>
<p>I tried again:</p>
<p>It worked. All the links were valid and pointed to the correct person. Some cells were correctly marked as "Profile not found".</p>
<p>Why did it work the second time? Airtable had written a prompt for the AI field, and the first draft was bad.</p>
<p>Next:</p>
<p>Result:</p>
<p>Solid.</p>
<p>It worked.</p>
<p>Result:</p>
<p>Nice!</p>
]]></content:encoded>
    <pubDate>Tue, 12 Aug 2025 10:00:00 GMT</pubDate>
    <atom:updated>2025-08-12T10:00:00.000Z</atom:updated>
    <author>wow@pjh.is (Peter Hartree)</author>
    <category>journal</category><category>data processing</category><category>workflow</category>
  </item>

  <item>
    <guid>https://wow.pjh.is/journal/spreadsheet-data-enrichment</guid>
    <title>Spreadsheet data enrichment: Google Sheets vs Shortcut vs Airtable</title>
    <link>https://wow.pjh.is/journal/spreadsheet-data-enrichment</link>
    <content:encoded><![CDATA[<p>I have a list of names and email addresses. I want to enrich this data with job titles, organisation names, and job descriptions.</p>
<p>I tried <a href="https://sheet.new">Google Sheets</a>, <a href="https://tryshortcut.ai">Shortcut</a>, and <a href="https://airtable.com">Airtable</a>. Airtable was the best.</p>
<h2>Google Sheets can't do it</h2>
<p>The Gemini sidebar and the <code>AI</code> function can't do web searches.</p>
<p>So, Google Sheets can't help me. <sup><a href="#user-content-fn-1" id="user-content-fnref-1" data-footnote-ref aria-describedby="footnote-label">1</a></sup></p>
<h2>Shortcut was ok</h2>
<p>I asked the chat:</p>
<p>It worked! It took about 3 minutes to run through 33 names and add the info.</p>
<p>Problems:</p>
<ul>
<li>Shortcut can only import .xlsx files, not .csv. So I copy-pasted the data from Google Sheets.</li>
<li>Shortcut imitates the UI of Microsoft Excel. For a Google Sheets user, it's not immediately intuitive.</li>
<li>Shortcut is an early beta, not a mature product. I saw rough edges and bugs.</li>
</ul>
<h2>Airtable was great</h2>
<p>I gave Airtable chat the same prompt as Shortcut.</p>
<p>Initially, the agent tried searching the web itself for each contact, right within the chat. It made it through about five contacts and then crashed.</p>
<p>Then I asked:</p>
<p>It took about 30 seconds to create the <a href="https://www.airtable.com/guides/scale/how-to-use-airtable-ai">Airtable AI fields</a>, then another minute or so to run them. This got me all the data that I wanted.</p>
<h2>Data accuracy</h2>
<p><strong>In 4/33 cases, Airtable hallucinated the job description.</strong> The hallucinated text was the same in all four cases, and taken from the correct org's website. <strong>The field prompt that Airtable generated had the obvious major problem of not telling the AI it can return "not found", or flag uncertainty:</strong></p>
<p>I fixed it by editing the prompt.</p>
<p>In some cases, people had one job title on LinkedIn and one job title on their organisation website. Shortcut and Airtable made different decisions about which one to prioritize. Ideally, the field prompts would have anticipated this edge case and told the AI to flag uncertainties like this.</p>
<p>So: next time, when I ask the Airtable assistant to write a retrieval prompt, I would explicitly say that the prompt should: (a) say "not found" if it can't find it and (b) add a "comments" column to flag uncertainty or other issues.</p>
<p>Aside from that, both Shortcut and Airtable gave mostly accurate data. They correctly marked a few contacts as "couldn't find the info".</p>
<h2>Bottom line</h2>
<p>I'll use Airtable for data enrichment going forward. It's very good, and very easy.</p>
<p>To improve accuracy: before running the prompts it generates, I'll copy-paste them into ChatGPT and ask it to check for issues. Then I'll carefully read the final prompt myself.</p>
<h2>Appendix 1. Example use cases for Airtable AI</h2>
<section data-footnotes class="footnotes"><h2 class="sr-only" id="footnote-label">Footnotes</h2>
<ol>
<li id="user-content-fn-1">
<p>Note that you <em>can</em> import data from a specific URL using the <code>IMPORTHTML</code> or <code>IMPORTXML</code> functions. <a href="#user-content-fnref-1" data-footnote-backref="" aria-label="Back to reference 1" class="data-footnote-backref">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
    <pubDate>Thu, 07 Aug 2025 00:00:00 GMT</pubDate>
    <atom:updated>2025-08-12T00:00:00.000Z</atom:updated>
    <author>wow@pjh.is (Peter Hartree)</author>
    <category>journal</category><category>data processing</category>
  </item>

  <item>
    <guid>https://wow.pjh.is/journal/data-analysis</guid>
    <title>Data analysis: clean your CSVs, and use o3 or Gemini (not Claude)</title>
    <link>https://wow.pjh.is/journal/data-analysis</link>
    <content:encoded><![CDATA[<p><em>Epistemic status: based on a single test, plus memory of previous results. I do "light" data analysis tasks every month, but not every week.</em></p>
<p>This week I did a personal finance review. The review required some simple spreadsheet analysis.</p>
<p>I sent the following prompt to o3, o3-pro, Claude 4 Opus (Thinking), Gemini 2.5 Pro, and Grok 3 (Thinking):</p>
<p>Observations:</p>
<ol>
<li>Initially, I uploaded a somewhat messy CSV that contained a bunch of irrelevant information. o3 and Gemini 2.5 Pro made mistakes while trying to identify the correct information.</li>
<li>With the cleaned CSV, o3, o3-pro, Gemini 2.5 Pro and Grok 3 (Thinking) did well and their numbers agreed with each other.</li>
<li>Claude 4 Opus failed. I sent the same prompt three times: the first time it gave up, the second time it threw a server error (Anthropic really struggle with capacity), and the third time it gave incorrect results.</li>
<li>All the models (with the possible exception of Grok <sup><a href="#user-content-fn-1" id="user-content-fnref-1" data-footnote-ref aria-describedby="footnote-label">1</a></sup>) used code, without an explicit prompt to do so.</li>
</ol>
<p>My takeaways: clean your CSVs; don't use Claude.</p>
<section data-footnotes class="footnotes"><h2 class="sr-only" id="footnote-label">Footnotes</h2>
<ol>
<li id="user-content-fn-1">
<p>The Grok UI did not show that it used code. It might have done under the hood. <a href="#user-content-fnref-1" data-footnote-backref="" aria-label="Back to reference 1" class="data-footnote-backref">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
    <pubDate>Tue, 08 Jul 2025 14:00:00 GMT</pubDate>
    <atom:updated>2025-07-08T14:00:00.000Z</atom:updated>
    <author>wow@pjh.is (Peter Hartree)</author>
    <category>journal</category><category>data processing</category>
  </item>

    </channel>
  </rss>
