<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:media="http://search.yahoo.com/mrss/"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Science Archives - AI SCKOOL</title>
	<atom:link href="https://aisckool.com/category/data-science/feed/" rel="self" type="application/rss+xml" />
	<link>https://aisckool.com/category/data-science/</link>
	<description>All About Artificial Intelligence</description>
	<lastBuildDate>Sun, 07 Jun 2026 06:49:25 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://aisckool.com/wp-content/uploads/2024/05/cropped-8FDB48F0-2148-449F-B10B-86E84E56DAD5-removebg-preview-1-e1716890217940-32x32.png</url>
	<title>Data Science Archives - AI SCKOOL</title>
	<link>https://aisckool.com/category/data-science/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>How to write to files in Python: a beginner&#8217;s guide</title>
		<link>https://aisckool.com/how-to-write-to-files-in-python-a-beginners-guide/</link>
					<comments>https://aisckool.com/how-to-write-to-files-in-python-a-beginners-guide/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Sun, 07 Jun 2026 06:49:25 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27385</guid>

					<description><![CDATA[<p>Writing to files is an essential Python skill. It allows you to save data permanently instead of losing it when the program stops. You can use file writing to store results, logs, reports, user input, settings, and structured data. In this guide, you&#8217;ll learn how to create text files, write multiple lines, [&#8230;]</p>
<p>The post <a href="https://aisckool.com/how-to-write-to-files-in-python-a-beginners-guide/">How to write to files in Python: a beginner&#8217;s guide</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div id="post-">
<p> </p>
<h2><span># </span>Introduction</h2>
<p>Writing to files is an essential Python skill. It allows you to save data permanently instead of losing it when the program stops. You can use file writing to store results, logs, reports, user input, settings, and structured data.</p>
<p>In this guide, you&#8217;ll learn how to create text files, write multiple lines, append content, work with folders, and save data in CSV and JSON formats. You&#8217;ll also learn about the most common file modes, including <code style="background: #F5F5F5;">w</code>, <code style="background: #F5F5F5;">a</code>, <code style="background: #F5F5F5;">x</code>, and <code style="background: #F5F5F5;">r</code>, and when to use each one.</p>
<p>By the end, you will be able to write Python programs that write results, reports, logs, and structured data to files.</p>
</p>
<h2><span># </span>Writing your first text file</h2>
<p>The easiest way to write to a file is to use Python&#8217;s built-in <code style="background: #F5F5F5;">open()</code> function.</p>
<p>The <code style="background: #F5F5F5;">w</code> mode means write mode. If the file does not exist, Python creates it. If the file already exists, Python replaces its existing contents.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>file = open("message.txt", "w")
file.write("Hello, this is my first file written with Python.")
file.close()</code></pre>
</div>
<p>When you run this code, Python creates a file called <code style="background: #F5F5F5;">message.txt</code> in the same folder where the notebook or script is located.</p>
<p>You can read the file back to see what was written.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>file = open("message.txt", "r")
content = file.read()
file.close()

print(content)</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>Hello, this is my first file written with Python.</code></pre>
</div>
<h2><span># </span>Using <code>with open()</code>: A better way</h2>
<p>Although you can manually open and close files, the recommended approach is to use <code style="background: #F5F5F5;">with open()</code>.</p>
<p>This will automatically close the file when the code block ends. It is cleaner, safer, and widely used in real Python projects.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("message.txt", "w") as file:
    file.write("This file was written using with open().")

with open("message.txt", "r") as file:
    content = file.read()

print(content)</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>This file was written using with open().</code></pre>
</div>
<p>Using <code style="background: #F5F5F5;">with open()</code> is best practice because you don&#8217;t have to remember to close the file manually.</p>
</p>
<h2><span># </span>Understanding file modes</h2>
<p>When you open a file, the mode tells Python what you want to do with it.</p>
</p>
<table style="width: 100%; border-collapse: collapse; font-family: Arial, sans-serif; font-size: 14px; color: #333;">
<thead>
<tr style="background-color: #ffd29a;">
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Mode</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;"><code style="background: #F5F5F5;">w</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">Write to a file. Creates a new file or replaces the contents of an existing one.</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;"><code style="background: #F5F5F5;">a</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">Append to a file. Adds content at the end without removing existing content.</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;"><code style="background: #F5F5F5;">x</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">Create a new file. Fails if the file already exists.</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;"><code style="background: #F5F5F5;">r</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">Read a file. Fails if the file does not exist.</td>
</tr>
</tbody>
</table>
<p>For writing files, the most common modes are <code style="background: #F5F5F5;">w</code> and <code style="background: #F5F5F5;">a</code>. Use <code style="background: #F5F5F5;">w</code> when you want to create a new file or replace existing content. Use <code style="background: #F5F5F5;">a</code> when you want to add new content at the end of the file.</p>
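<p>The differences between the write-related modes are easiest to see side by side. The sketch below is only an illustration: the file name <code style="background: #F5F5F5;">modes_demo.txt</code> and the temporary folder are assumptions chosen so that no real file gets overwritten.</p>

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file inside a temporary folder (an assumption for this demo).
demo_path = Path(tempfile.mkdtemp()) / "modes_demo.txt"

with open(demo_path, "w") as file:  # "w" creates the file
    file.write("first\n")

with open(demo_path, "w") as file:  # "w" again replaces the existing contents
    file.write("second\n")

with open(demo_path, "a") as file:  # "a" adds to the end of the file
    file.write("third\n")

try:
    with open(demo_path, "x") as file:  # "x" refuses because the file exists
        file.write("never written\n")
    x_mode_result = "created"
except FileExistsError:
    x_mode_result = "refused"

with open(demo_path, "r") as file:  # "r" reads the file back
    final_content = file.read()

print(final_content)
print(x_mode_result)
```

<p>Only the lines written by the second <code style="background: #F5F5F5;">w</code> call and the <code style="background: #F5F5F5;">a</code> call should remain in the file, because the second <code style="background: #F5F5F5;">w</code> call discards what the first one wrote.</p>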
</p>
<h2><span># </span>Writing multiple lines</h2>
<p>You can write multiple lines by ending each one with the newline character <code style="background: #F5F5F5;">\n</code>.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("notes.txt", "w") as file:
    file.write("Line 1: Learn Python\n")
    file.write("Line 2: Practice file handling\n")
    file.write("Line 3: Build small projects\n")</code></pre>
</div>
<p>Read file:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("notes.txt", "r") as file:
    print(file.read())</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>Line 1: Learn Python
Line 2: Practice file handling
Line 3: Build small projects</code></pre>
</div>
<p>You can also use <code style="background: #F5F5F5;">writelines()</code> to write a list of strings to a file.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>tasks = [
    "Write Python code\n",
    "Run the notebook\n",
    "Check the output file\n"
]

with open("tasks.txt", "w") as file:
    file.writelines(tasks)</code></pre>
</div>
<p>Read file:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("tasks.txt", "r") as file:
    print(file.read())</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>Write Python code
Run the notebook
Check the output file</code></pre>
</div>
<p>One important thing to remember: <code style="background: #F5F5F5;">writelines()</code> does not automatically add line breaks. You must include <code style="background: #F5F5F5;">\n</code> yourself.</p>
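<p>If your list items do not already end with a newline, one option is to add it while writing. This is a minimal sketch, assuming a plain list of task strings and a temporary folder for the output file; both are illustrative choices.</p>

```python
import tempfile
from pathlib import Path

# Items without trailing newlines (illustrative data).
tasks = ["Write Python code", "Run the notebook", "Check the output file"]

# Temporary location so the demo does not touch existing files (an assumption).
tasks_path = Path(tempfile.mkdtemp()) / "tasks.txt"

with open(tasks_path, "w") as file:
    # writelines() accepts any iterable of strings, so the newline
    # can be appended to each item as it is written.
    file.writelines(task + "\n" for task in tasks)

with open(tasks_path, "r") as file:
    saved_lines = file.read().splitlines()

print(saved_lines)
```

<p>Reading the file back with <code style="background: #F5F5F5;">splitlines()</code> is a quick way to confirm that each item ended up on its own line.</p>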
</p>
<h2><span># </span>Appending to a file</h2>
<p>Sometimes you don&#8217;t want to overwrite the existing content of a file. Instead, you can add new content at the end.</p>
<p>To do this, use append mode: <code style="background: #F5F5F5;">a</code>.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("journal.txt", "w") as file:
    file.write("Day 1: I started learning Python file handling.\n")

with open("journal.txt", "a") as file:
    file.write("Day 2: I learned how to append text to a file.\n")</code></pre>
</div>
<p>Read file:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("journal.txt", "r") as file:
    print(file.read())</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>Day 1: I started learning Python file handling.
Day 2: I learned how to append text to a file.</code></pre>
</div>
<p>Append mode is useful when working with logs, journals, reports, or any file to which you want to keep adding new information.</p>
</p>
<h2><span># </span>Creating files safely</h2>
<p>If you want to create a new file but avoid overwriting an existing one, use <code style="background: #F5F5F5;">x</code> mode.</p>
<p>This mode only creates a file if it does not already exist. If the file already exists, Python raises a <code style="background: #F5F5F5;">FileExistsError</code>.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>try:
    with open("new_file.txt", "x") as file:
        file.write("This file was created using x mode.")
    print("File created successfully.")
except FileExistsError:
    print("The file already exists, so Python did not overwrite it.")</code></pre>
</div>
<p>If the file does not exist, you will see:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>File created successfully.</code></pre>
</div>
<p>If the file already exists, you will see:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>The file already exists, so Python did not overwrite it.</code></pre>
</div>
<p>This is useful when you want to protect existing files from being accidentally overwritten.</p>
</p>
<h2><span># </span>Working with file paths</h2>
<p>By default, a relative file name is resolved against the current working directory, which is usually the folder your notebook or script runs from.</p>
<p>If you want to save files to a specific folder, you can use <strong><a href="https://docs.python.org/3/library/pathlib.html" target="_blank" rel="noopener">pathlib</a></strong>.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>from pathlib import Path

output_folder = Path("output")
output_folder.mkdir(exist_ok=True)

file_path = output_folder / "summary.txt"

with open(file_path, "w") as file:
    file.write("This file was saved inside the output folder.")

print(f"File saved to: {file_path}")</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>File saved to: output/summary.txt</code></pre>
</div>
<p>Now read the file:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("output/summary.txt", "r") as file:
    print(file.read())</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>This file was saved inside the output folder.</code></pre>
</div>
<p>The <code style="background: #F5F5F5;">mkdir(exist_ok=True)</code> call creates a folder if it doesn&#8217;t already exist. If the folder already exists, Python does not report an error.</p>
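<p>For folders nested several levels deep, <code style="background: #F5F5F5;">mkdir()</code> also accepts <code style="background: #F5F5F5;">parents=True</code>, which creates any missing parent folders as well. The folder names below are hypothetical, and the sketch works inside a temporary folder so it leaves your project untouched.</p>

```python
import tempfile
from pathlib import Path

# Temporary base folder for the demo (an assumption, not part of the guide).
base_folder = Path(tempfile.mkdtemp())

# Hypothetical nested folder structure.
nested_folder = base_folder / "reports" / "2024" / "june"

# parents=True creates "reports" and "2024" if they are missing.
nested_folder.mkdir(parents=True, exist_ok=True)

# Calling it again is harmless because exist_ok=True suppresses the error.
nested_folder.mkdir(parents=True, exist_ok=True)

report_path = nested_folder / "summary.txt"
with open(report_path, "w") as file:
    file.write("Saved inside nested folders.")

with open(report_path, "r") as file:
    report_text = file.read()

print(report_text)
```

<p>Without <code style="background: #F5F5F5;">parents=True</code>, the same call raises a <code style="background: #F5F5F5;">FileNotFoundError</code> when an intermediate folder does not exist.</p>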
</p>
<h2><span># </span>Saving CSV files</h2>
<p>CSV files are useful for saving tabular data such as rows and columns. They are commonly opened in spreadsheet tools such as Excel or Google Sheets.</p>
<p>To write a CSV file in Python, use the <strong><a href="https://docs.python.org/3/library/csv.html" target="_blank" rel="noopener">csv</a></strong> module.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>import csv

students = [
    ["Name", "Score"],
    ["Ayesha", 92],
    ["Bilal", 85],
    ["Sara", 88]
]

with open("students.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(students)</code></pre>
</div>
<p>Read the CSV file:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("students.csv", "r") as file:
    print(file.read())</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>Name,Score
Ayesha,92
Bilal,85
Sara,88</code></pre>
</div>
<p>The <code style="background: #F5F5F5;">newline=""</code> argument helps avoid extra blank lines when writing CSV files, especially on Windows.</p>
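<p>Printing the raw text shows what is in the file, but to work with the rows again you would normally parse them with <code style="background: #F5F5F5;">csv.reader</code>. The sketch below reuses the student data from above and assumes a temporary folder for the file. Note that every value comes back as a string, so numbers need converting.</p>

```python
import csv
import tempfile
from pathlib import Path

students = [
    ["Name", "Score"],
    ["Ayesha", 92],
    ["Bilal", 85],
    ["Sara", 88],
]

# Temporary location for the demo file (an assumption).
csv_path = Path(tempfile.mkdtemp()) / "students.csv"

with open(csv_path, "w", newline="") as file:
    csv.writer(file).writerows(students)

# Parse the file back into a list of rows.
with open(csv_path, "r", newline="") as file:
    rows = list(csv.reader(file))

header = rows[0]
records = rows[1:]

# CSV stores text only, so scores are converted back to integers here.
scores = [int(row[1]) for row in records]

print(header)
print(scores)
```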
</p>
<h2><span># </span>Saving JSON files</h2>
<p>JSON is another common format for storing structured data. It is often used for dictionaries, API responses, configuration files, and nested data.</p>
<p>To write JSON files in Python, use the <strong><a href="https://docs.python.org/3/library/json.html" target="_blank" rel="noopener">json</a></strong> module.</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>import json

profile = {
    "name": "Ayesha",
    "role": "Data Analyst",
    "skills": ["Python", "SQL", "Excel"],
    "active": True
}

with open("profile.json", "w") as file:
    json.dump(profile, file, indent=4)</code></pre>
</div>
<p>Read the JSON file:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>with open("profile.json", "r") as file:
    print(file.read())</code></pre>
</div>
<p>Output:</p>
<div style="width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;">
<pre><code>{
    "name": "Ayesha",
    "role": "Data Analyst",
    "skills": [
        "Python",
        "SQL",
        "Excel"
    ],
    "active": true
}</code></pre>
</div>
<p>The <code style="background: #F5F5F5;">indent=4</code> argument makes the JSON file easier to read.</p>
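<p>To turn the saved JSON back into a Python dictionary, <code style="background: #F5F5F5;">json.load()</code> is the counterpart of <code style="background: #F5F5F5;">json.dump()</code>. This is a small round-trip sketch using the same profile data; the temporary folder is an assumption made to keep the demo self-contained.</p>

```python
import json
import tempfile
from pathlib import Path

profile = {
    "name": "Ayesha",
    "role": "Data Analyst",
    "skills": ["Python", "SQL", "Excel"],
    "active": True,
}

# Temporary location for the demo file (an assumption).
json_path = Path(tempfile.mkdtemp()) / "profile.json"

with open(json_path, "w") as file:
    json.dump(profile, file, indent=4)

# json.load() parses the file back into Python objects.
with open(json_path, "r") as file:
    loaded_profile = json.load(file)

print(loaded_profile["skills"])
print(loaded_profile["active"])
```

<p>JSON&#8217;s <code style="background: #F5F5F5;">true</code> is converted back to Python&#8217;s <code style="background: #F5F5F5;">True</code>, so the loaded dictionary matches the original.</p>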
</p>
<h2><span># </span>Common beginner mistakes</h2>
<p>Here are some common mistakes beginners make when writing files in Python.</p>
</p>
<table style="width: 100%; border-collapse: collapse; font-family: Arial, sans-serif; font-size: 14px; color: #333;">
<thead>
<tr style="background-color: #ffd29a;">
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Mistake</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">What happens</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">How to fix it</th>
</tr>
</thead>
<tbody>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Forgetting to close a file</td>
<td style="padding: 12px; border: 1px solid #ddd;">Changes may not be saved correctly</td>
<td style="padding: 12px; border: 1px solid #ddd;">Use <code style="background: #F5F5F5;">with open()</code></td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Using <code style="background: #F5F5F5;">w</code> instead of <code style="background: #F5F5F5;">a</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">Existing content is removed</td>
<td style="padding: 12px; border: 1px solid #ddd;">Use <code style="background: #F5F5F5;">a</code> when appending</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Forgetting <code style="background: #F5F5F5;">\n</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">The text appears on one line</td>
<td style="padding: 12px; border: 1px solid #ddd;">Add newlines</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Writing to a missing folder</td>
<td style="padding: 12px; border: 1px solid #ddd;">Python raises a <code style="background: #F5F5F5;">FileNotFoundError</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">Create the folder first</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Directly writing non-string data</td>
<td style="padding: 12px; border: 1px solid #ddd;">Python can raise a <code style="background: #F5F5F5;">TypeError</code></td>
<td style="padding: 12px; border: 1px solid #ddd;">Convert values to strings or use CSV/JSON</td>
</tr>
</tbody>
</table>
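<p>The last row of the table is worth seeing in action. The sketch below, which assumes a made-up score value and a temporary file, shows the <code style="background: #F5F5F5;">TypeError</code> raised when a number is passed to <code style="background: #F5F5F5;">write()</code> and two common ways to convert it first.</p>

```python
import tempfile
from pathlib import Path

score = 92  # illustrative non-string value

# Temporary location for the demo file (an assumption).
score_path = Path(tempfile.mkdtemp()) / "score.txt"

with open(score_path, "w") as file:
    try:
        file.write(score)  # write() in text mode only accepts strings
        outcome = "written"
    except TypeError:
        outcome = "TypeError"

    # Fix 1: convert explicitly with str().
    file.write(str(score) + "\n")

    # Fix 2: build the text with an f-string.
    file.write(f"Score: {score}\n")

with open(score_path, "r") as file:
    saved_text = file.read()

print(outcome)
print(saved_text)
```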
<h2><span># </span>Summary</h2>
<p>Writing to files is one of the most useful Python skills for beginners. I still remember entering a programming competition in my second semester of engineering and wasting almost an hour trying to figure out how to save a file. If I had known it was that simple, I could have won.</p>
<p>File writing helps you store logs, save program output, create reports, store user data, and even read and write simple databases using formats like JSON. The best part is that Python file handling is built in, fast, and works out of the box.</p>
<p>For most tasks, use <code style="background: #F5F5F5;">with open()</code> because it automatically closes the file. Use <code style="background: #F5F5F5;">w</code> to write or overwrite a file, <code style="background: #F5F5F5;">a</code> to append new content, and <code style="background: #F5F5F5;">x</code> to safely create a new file without overwriting an existing one.</p>
<p><strong><a href="https://abid.work" target="_blank" rel="noopener noreferrer">Abid Ali Awan</a></strong> (<a href="https://www.linkedin.com/in/1abidaliawan" rel="noopener" target="_blank">@1abidaliawan</a>) is a certified data science professional who loves building machine learning models. Currently, he focuses on creating content and writing technical blogs about machine learning and data science technologies. Abid holds a Master&#8217;s degree in Technology Management and a Bachelor&#8217;s degree in Telecommunications Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.</p>
</p></div>
<p>The post <a href="https://aisckool.com/how-to-write-to-files-in-python-a-beginners-guide/">How to write to files in Python: a beginner&#8217;s guide</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/how-to-write-to-files-in-python-a-beginners-guide/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i0.wp.com/www.kdnuggets.com/wp-content/uploads/kdn_awan_write_files_python_beginners_guide_feature.png?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Uranus&#8217;s moons may be the key to finding lost planets</title>
		<link>https://aisckool.com/uranuss-moons-may-be-the-key-to-finding-lost-planets/</link>
					<comments>https://aisckool.com/uranuss-moons-may-be-the-key-to-finding-lost-planets/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Sat, 06 Jun 2026 21:48:40 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27381</guid>

					<description><![CDATA[<p>We have an idea of the solar system&#8217;s past: it was full of violence and chaos. However, we are still investigating how violent that past was. Current models suggest that at some point after the formation of the giant planets, they underwent a phase of such extreme instability that one or even two bodies the size [&#8230;]</p>
<p>The post <a href="https://aisckool.com/uranuss-moons-may-be-the-key-to-finding-lost-planets/">Uranus&#8217;s moons may be the key to finding lost planets</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div>
<p><span class="lead-in-text-callout">We have</span> an idea of the solar system&#8217;s past: it was full of violence and chaos. However, we are still investigating how violent that past was. Current models suggest that at some point after the formation of the giant planets, they underwent a phase of such extreme instability that one or even two bodies the size of Uranus or Neptune were ejected into interstellar space. If such a scenario did play out, we could find clues in the most unexpected places in the solar system, such as the moons of Jupiter and especially Uranus.</p>
<p class="paywall">A recent article published in <a href="https://www.sciencedirect.com/science/article/abs/pii/S0019103526001223?via%3Dihub" class="text link" target="_blank" rel="noopener"><em>Icarus</em></a> analyzed 122 possible scenarios for such instability to assess how the satellite systems of the planets &#8220;left behind&#8221; would respond. The scientists concluded that it would be extremely difficult to explain the current characteristics of Uranus&#8217;s moons without some episode of violent instability. This type of instability only appears in models where there were more giant planets than we see today.</p>
<p class="paywall">The authors indicate that Uranus&#8217;s moons were most likely destabilized at least twice in the past: first by an impact that tilted the planet, and then by a close encounter between the giant planets during a period of instability. This chaos, fueled by the presence of one or more planets that were later ejected, would have destroyed the moon system and rebuilt it into the state we see today.</p>
<figure class="AssetEmbedWrapper-iJvQnD cOWUYC asset-embed">
<div class="AssetEmbedAssetContainer-fnduJP iaVSwI asset-embed__asset-container"><span class="SpanWrapper-kFnjvc eKnjjD responsive-asset AssetEmbedResponsiveAsset-gaAbQ hXaxHA asset-embed__responsive-asset"><picture class="ResponsiveImagePicture-jKunQM gjCCFj AssetEmbedResponsiveAsset-gaAbQ hXaxHA asset-embed__responsive-asset responsive-image"></picture></span></div>
<div class="CaptionWrapper-bpPcvW iDPSlt caption AssetEmbedCaption-eZIMNW gMgneI asset-embed__caption" data-testid="caption-wrapper"><span class="BaseText-fEwdHD CaptionText-cQpRdU kRTNAB hbiMYj caption__text">Miranda, a moon of Uranus considered the most unusual in the solar system.</span><span class="BaseText-fEwdHD CaptionCredit-cUgOGk iQbGEh hRFzlA caption__credit">NASA</span></div>
</figure>
<h2 class="paywall">The solar system and chaos</h2>
<p class="paywall">Jupiter, Saturn, Uranus and Neptune have not always had their current positions in the solar system. According to the planetary instability model, they were born slightly closer to the Sun and closer to each other. After millions of years, they migrated towards their current orbits.</p>
<p class="paywall">However, some details of this model do not agree with observations. For one, the current orbits of Jupiter and Saturn are eccentric; for another, structures such as the Kuiper Belt apparently should have prevented Neptune from moving to its current position. In simulations, the planets do not end up where they are today.</p>
<p class="paywall">It is therefore possible that the solar system at one point had more planets and that they &#8220;pushed out the others.&#8221; Under this hypothesis, the puzzle of the solar system fits together better. The problem is that these bodies, if they existed, are gone &#8211; they were ejected and left no physical traces or fragments. This leaves the idea of lost planets as a hypothesis awaiting sufficient evidence to be confirmed.</p>
<h2 class="paywall">Extraordinary Moon</h2>
<p class="paywall">The new <em>Icarus</em> study tested the missing planets hypothesis using Uranus&#8217;s moons as direct evidence. A total of 122 simulations of the evolution of the solar system were used. In 85 percent of scenarios, Uranus&#8217;s moon system collapses. Its moons survived in only a few scenarios, and in all of them the hypothesis of lost, ejected planets fit very well.</p>
<p class="paywall">The report points to Miranda, the smallest moon in the main Uranus system. Astronomers believe it is the most unusual object in the solar system. It&#8217;s patchy, as if sewn together from scraps, too icy for its size, and quite small compared to Uranus&#8217;s other moons. It is also geologically active.</p>
<p class="paywall">Astronomers believe Miranda is the remnant of a larger body. The study confirms this thesis and suggests that this is the clearest example of traces of planetary instability.</p>
</div>
<p>The post <a href="https://aisckool.com/uranuss-moons-may-be-the-key-to-finding-lost-planets/">Uranus&#8217;s moons may be the key to finding lost planets</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/uranuss-moons-may-be-the-key-to-finding-lost-planets/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i0.wp.com/media.wired.com/photos/6a1f57f22dd56ccdeabd077c/191:100/w_1280,c_limit/GettyImages-1088373686.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Deep dive into language model calibration: Platt scaling, isotonic regression, temperature scaling</title>
		<link>https://aisckool.com/deep-dive-into-language-model-calibration-platt-scaling-isotonic-regression-temperature-scaling/</link>
					<comments>https://aisckool.com/deep-dive-into-language-model-calibration-platt-scaling-isotonic-regression-temperature-scaling/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Sat, 06 Jun 2026 12:42:24 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27377</guid>

					<description><![CDATA[<p>A model that claims to be 90% confident should be right 90% of the time. When this relationship breaks down, you have a miscalibration problem. The model&#8217;s outputs no longer say anything useful about reliability. For large language models (LLMs), miscalibration is common. A NAACL 2024 study found that confidence scores deviated [&#8230;]</p>
<p>The post <a href="https://aisckool.com/deep-dive-into-language-model-calibration-platt-scaling-isotonic-regression-temperature-scaling/">Deep dive into language model calibration: Platt scaling, isotonic regression, temperature scaling</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div id="post-">
<p> </p>
<h2><span># </span>Introduction</h2>
<p>A model that claims to be 90% confident should be right 90% of the time. When this relationship breaks down, you have a <strong>miscalibration</strong> problem. The model&#8217;s outputs no longer say anything useful about reliability.</p>
<p>For <strong>large language models</strong> (LLMs), miscalibration is common. A <a href="https://aclanthology.org/2024.naacl-long.366/" target="_blank" rel="noopener">NAACL 2024 study</a> found that confidence scores deviated from actual correctness rates in QA, code generation, and inference tasks.</p>
<p>Another <a href="https://www.biorxiv.org/content/10.1101/2025.02.11.637373v1.full" target="_blank" rel="noopener">study</a> of biomedical models found average calibration scores ranging from just 23.9% to 46.6% across all models tested. The gap is consistent.</p>
<p>The standard solution in <strong>classical machine learning</strong> is post-hoc recalibration: fit a simple function on a held-out validation set to map the raw confidence scores to better calibrated probabilities.</p>
<p><strong>Three</strong> methods dominate: <a href="https://github.com/gpleiss/temperature_scaling" target="_blank" rel="noopener"><strong>temperature scaling</strong></a>, <a href="https://www.blog.trainindata.com/complete-guide-to-platt-scaling/" target="_blank" rel="noopener"><strong>Platt scaling</strong></a>, and <a href="https://en.wikipedia.org/wiki/Isotonic_regression#:~:text=Isotonic%20regression%20is%20used%20iteratively,of%20supervised%20machine%20learning%20models." target="_blank" rel="noopener"><strong>isotonic regression</strong></a>. All three were designed for <a href="https://medium.com/@akankshamalhotra24/generative-classifiers-v-s-discriminative-classifiers-1045f499d8cc" target="_blank" rel="noopener">discriminative classifiers</a>, and applying them to LLMs requires caution.</p>
<p><img decoding="async" alt="LLM calibration" width="100%" class="perfmatters-lazy" src="https://www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-3.png"></p>
</p>
<h2><span># </span>Measuring calibration</h2>
<p>The dominant metric is <a href="https://towardsdatascience.com/expected-calibration-error-ece-a-step-by-step-visual-explanation-with-python-code-c3e9aa12937d/" target="_blank" rel="noopener"><strong>Expected Calibration Error</strong></a> (ECE). It groups predictions into confidence bins, calculates the difference between mean confidence and observed accuracy within each bin, and averages across bins weighted by size. An ECE of 0 means perfect calibration.</p>
<p>A reliability diagram shows the relationship between confidence and accuracy. A perfectly calibrated model sits on the diagonal. Below is an overconfident model: the curve shows high confidence, but accuracy cannot keep up.</p>
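<p>The ECE description above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming ten equal-width bins and a small made-up set of confidences and correctness labels; the function name <code style="background: #F5F5F5;">expected_calibration_error</code> is hypothetical and not taken from any library.</p>

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins (an illustrative sketch)."""
    bins = [[] for _ in range(n_bins)]
    for confidence, is_correct in zip(confidences, correct):
        # Map a confidence in [0, 1] to a bin index; 1.0 goes in the last bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, is_correct))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of predictions.
        ece += (len(members) / total) * abs(mean_confidence - accuracy)
    return ece


# Made-up predictions: four at 95% confidence (half wrong), two at 55%.
confidences = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
correct = [True, True, False, False, True, False]

ece = expected_calibration_error(confidences, correct)
print(round(ece, 4))
```

<p>In this toy data the high-confidence bin dominates the score: its predictions claim 95% confidence but are right only half the time, which is the overconfidence pattern the metric is meant to expose.</p>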
<p><img decoding="async" alt="LLM calibration" width="100%" class="perfmatters-lazy" src="https://www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-4.png"> </p>
<p>AND <a href="https://aejaspan.github.io/posts/2025-09-01-LLM-Clasifier-Confidence-Scores" target="_blank" rel="noopener">Rating 2025</a> GPT-4o-mini as a text classifier found that 66.7% of errors occurred at confidence levels above 80% &#8211; the canonical pattern of overconfidence.</p>
<p>ECE alone is increasingly seen as insufficient. A <a href="https://arxiv.org/html/2512.16030" target="_blank" rel="noopener">research article</a> recommends combining ECE with the <a href="https://en.wikipedia.org/wiki/Brier_score" target="_blank" rel="noopener">Brier score</a>, overconfidence measures, and reliability diagrams. A single number obscures significant differences in where and how the model misbehaves.</p>
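<p>For reference, the Brier score for binary correctness labels is the mean squared difference between stated confidence and outcome. The sketch below is a minimal illustration on assumed toy data; <code>brier_score</code> is a hypothetical helper name.</p>

```python
def brier_score(confidences, correct):
    """Mean squared error between confidence and the 0/1 outcome."""
    return sum((c - y) ** 2 for c, y in zip(confidences, correct)) / len(correct)


# Toy data (assumed): same overconfident pattern, 0.9 confidence with 60% accuracy.
confs = [0.9, 0.9, 0.9, 0.9, 0.9]
hits = [1, 1, 1, 0, 0]
brier = brier_score(confs, hits)
```

<p>Unlike ECE, the Brier score involves no binning, which is one reason the two are reported together.</p>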
</p>
<h2><span># </span>Why LLMs complicate the standard setup</h2>
<p>The three methods discussed here assume a fixed output space. The classifier produces one <strong>probability</strong> per class, and calibration maps those probabilities to better estimates.</p>
<p><strong>LLMs</strong> don&#8217;t work this way.</p>
<p>There are four complications that matter here.</p>
<p><img decoding="async" alt="LLM calibration" width="100%" class="perfmatters-lazy" src="https://www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-5.png"> </p>
<p>The output space is exponentially large: confidence cannot be computed directly at the sequence level. Semantically equivalent outputs may have very different token-level probabilities. Confidence also depends on the level of detail at which it is measured; a <a href="https://aclanthology.org/2024.naacl-long.366/" target="_blank" rel="noopener">research article</a> on atomic calibration showed that generative models show the lowest average confidence in the middle of a generation, rather than at the beginning or end.</p>
<p>Many LLMs only expose the probabilities of the top-k tokens through their <strong>API</strong>, so classic calibration approaches that rely on full logit access require modification.</p>
<p><img decoding="async" alt="LLM calibration" width="100%" class="perfmatters-lazy" src="https://www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-6.png"> </p>
<h2><span># </span>Applying temperature scaling</h2>
<p>Temperature scaling divides the logit vector by a scalar T before applying softmax. When T &gt; 1, the distribution flattens and confidence decreases. When T &lt; 1, the distribution sharpens and confidence increases.</p>
<p>T is fit on a held-out validation set by minimizing the negative log-likelihood. The method adds one parameter, preserves prediction rankings, and is cheap to compute.</p>
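<p>The fitting step can be sketched as a one-dimensional search. This is a minimal illustration in plain Python, not the reference implementation: the function names, the search bounds, and the toy logits are assumptions, and a golden-section search over log T stands in for whatever optimizer a real implementation would use.</p>

```python
import math


def nll(logits, labels, T):
    """Mean negative log-likelihood of the labels under softmax(logits / T)."""
    total = 0.0
    for z, y in zip(logits, labels):
        scaled = [v / T for v in z]
        m = max(scaled)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(v - m) for v in scaled))
        total -= scaled[y] - log_norm
    return total / len(labels)


def fit_temperature(logits, labels, lo=0.05, hi=20.0, iters=60):
    """Golden-section search for the T that minimizes validation NLL."""
    a, b = math.log(lo), math.log(hi)
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if nll(logits, labels, math.exp(c)) < nll(logits, labels, math.exp(d)):
            b = d
        else:
            a = c
    return math.exp((a + b) / 2)


# Toy validation set (assumed): the model always outputs logits [4, 0],
# i.e. about 98% confidence in class 0, but is right only 3 times out of 4.
val_logits = [[4.0, 0.0]] * 4
val_labels = [0, 0, 0, 1]
T = fit_temperature(val_logits, val_labels)
```

<p>On this toy set the fitted T is well above 1, which flattens the distribution until the stated confidence in class 0 matches the observed 75% accuracy. Because every logit is divided by the same positive scalar, the predicted class never changes.</p>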
<p>The <a href="https://github.com/gpleiss/temperature_scaling" target="_blank" rel="noopener">original formulation</a> targeted DenseNet image classifiers. For LLMs, temperature controls the probability distribution over the vocabulary at each decoding step, so the same logic applies.</p>
<p>The problem is <a href="https://huggingface.co/blog/rlhf" target="_blank" rel="noopener"><strong>reinforcement learning from human feedback</strong></a> (RLHF). Post-RLHF models develop input-dependent overconfidence: the degree of miscalibration varies with the input, and a single T cannot account for this variation.</p>
<p>Average ECE scores above 0.377 have been documented for models such as GPT-3 on verbalized self-confidence tasks, and a <a href="https://arxiv.org/html/2505.18658v2" target="_blank" rel="noopener">2025 study</a> confirms that RLHF-tuned models consistently overestimate their confidence.</p>
<p><a href="https://arxiv.org/abs/2409.19817" target="_blank" rel="noopener"><strong>Adaptive temperature scaling</strong></a> (ATS) addresses this directly. ATS predicts a temperature for each token from token-level hidden features and is fit on a supervised fine-tuning dataset, rather than using a single fixed T. The authors report that ATS improved calibration by 10-50% without compromising task performance. For any RLHF-tuned model, ATS provides a stronger baseline than standard temperature scaling.</p>
<p>Standard temperature scaling still works well for pre-RLHF base models. When miscalibration is approximately uniform across all inputs, a single T is often sufficient to correct for systematic over- or under-confidence.</p>
<p>The problem is specific to post-RLHF models, where input-dependent overconfidence means that a single T cannot correct for all inputs.</p>
</p>
<h2><span># </span>Applying Platt scaling</h2>
<p>Platt scaling fits a logistic function to uncalibrated scores: p = σ(A·s + B), where A and B are learned from a held-out validation set with binary correctness labels.</p>
<p>The sigmoid shape provides a parametric mapping with two free parameters.</p>
<p>Platt scaling was originally developed for SVMs, but can be generalized to any system that produces a scalar confidence score.</p>
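<p>A minimal sketch of the fit, assuming raw confidence scores and 0/1 correctness labels: plain gradient descent on the log loss stands in for a production optimizer, and the function names, learning rate, and toy data are all hypothetical.</p>

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_platt(scores, labels, lr=0.5, steps=20000):
    """Fit p = sigmoid(A * s + B) by gradient descent on the mean log loss."""
    A, B = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(A * s + B) - y  # gradient of the log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        A -= lr * grad_a / n
        B -= lr * grad_b / n
    return A, B


# Toy validation set (assumed): high raw confidence, only 5 of 8 answers correct.
scores = [0.55, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99, 0.99]
labels = [0, 1, 0, 1, 0, 1, 1, 1]
A, B = fit_platt(scores, labels)
calibrated = [sigmoid(A * s + B) for s in scores]
```

<p>With a positive A the mapping is monotonic, so the ranking of answers by confidence is preserved, and at the optimum the mean calibrated confidence equals the observed accuracy on the calibration set.</p>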
<p><img decoding="async" alt="LLM calibration" width="100%" class="perfmatters-lazy" src="https://www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-8.png"> </p>
<p>The two-parameter fit also makes efficient use of data compared with isotonic regression: it can produce useful estimates from a smaller calibration set, which is crucial in deployment contexts where labeled validation data is limited.</p>
<p>In the context of LLMs, Platt scaling works on sequence-level or token-level confidence scores.</p>
<p>A <a href="https://www.software-lab.org/publications/icse2025_calibration.pdf" target="_blank" rel="noopener">paper</a> on confidence in LLM-generated code found that Platt scaling produced better-calibrated scores than the uncalibrated baseline. Another study, on LLMs for text-to-SQL conversion, introduced <a href="https://arxiv.org/html/2409.10855v1" target="_blank" rel="noopener"><strong>multivariate Platt scaling</strong></a> (MPS), which extends single-variable Platt scaling to combine sub-clause frequency scores across multiple generated samples and consistently outperforms single-score baselines.</p>
<p>Two <strong>limitations</strong> are documented. First, global Platt scaling at the sequence level is too coarse for tasks where correctness depends on local editing decisions: a single sigmoid mapping cannot capture sample-dependent miscalibration patterns.</p>
<p>Second, Platt scaling may degrade proper scoring performance for strong models.</p>
</p>
<h2><span># </span>Using isotonic regression</h2>
<p>Isotonic regression is a non-parametric method.</p>
<p>It learns a piecewise-constant, monotone non-decreasing mapping from uncalibrated scores to calibrated probabilities using the <a href="https://medium.com/@jhimli.c1/unveiling-the-magic-of-pava-a-simple-path-to-monotonic-regression-37f19ffa60df" target="_blank" rel="noopener"><strong>pool adjacent violators algorithm</strong></a> (PAVA). No shape is assumed for the calibration function, which makes it more flexible than Platt scaling when the confidence-accuracy relationship is not sigmoidal.</p>
<p>The piecewise-constant output adapts to any monotonic shape: linear, stepped, or concave. This flexibility is the main reason isotonic regression tends to outperform Platt scaling in empirical comparisons.</p>
<p>The cost is a risk of overfitting on small calibration sets. The mapping generalizes well only when there is enough data to constrain it.</p>
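<p>PAVA itself is short enough to sketch directly. The version below is a minimal, unweighted illustration in plain Python; the function name <code>pava</code> and the toy labels are assumptions. It takes correctness labels already sorted by ascending confidence and returns the calibrated value for each position.</p>

```python
def pava(y):
    """Pool adjacent violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block is [mean, count]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the last two blocks break monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Toy data (assumed): 0/1 correctness, ordered from lowest to highest confidence.
outcomes = [0, 1, 0, 0, 1, 1, 0, 1]
fitted = pava(outcomes)
```

<p>The result is a step function with levels 0, 1/3, 2/3, and 1. With only eight points each step rests on two or three examples, which illustrates why small calibration sets overfit.</p>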
<p>Empirically, isotonic regression outperforms Platt scaling.</p>
<p>A rigorous <a href="https://arxiv.org/html/2509.23665v1" target="_blank" rel="noopener">comparison</a> across multiple datasets and architectures showed that isotonic regression outperforms Platt scaling on ECE and Brier scores with statistical significance, using paired tests with a Bonferroni correction at α = 0.003.</p>
<p><img decoding="async" alt="LLM calibration" width="100%" class="perfmatters-lazy" src="https://www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-9.png"> </p>
<p>In this study, the Random Forest baseline improved from a reliability score of 0.8268 when uncalibrated to 0.9551 with Platt scaling and 0.9660 with isotonic regression. Both methods can degrade proper scoring performance for strong models, but the isotonic advantage persists consistently.</p>
<p>For multiclass LLM settings, standard isotonic regression has been shown to improve further with normalization-aware extensions, which consistently outperform both one-vs-rest (OvR) isotonic regression and standard parametric methods on NLL and ECE.</p>
<p>The data requirement is a binding constraint. The advantage of isotonic regression is real, but it does not carry over to low-data deployment scenarios.</p>
</p>
<h2><span># </span>What the literature leaves open</h2>
<p>Three <strong>gaps</strong> are worth flagging before implementing any of these methods.</p>
<p>The <strong>RLHF</strong> interaction has only been examined for temperature scaling. How <strong>Platt scaling</strong> and isotonic regression behave on post-RLHF models has not been systematically tested. <strong>ATS</strong> exists because standard temperature scaling needed an explicit fix in this case. It is an open question whether the other two methods require similar extensions.</p>
<p><img decoding="async" alt="LLM Calibration" width="100%" class="perfmatters-lazy" src="https://www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-10.png"> </p>
<p>The most direct <strong>comparisons</strong> of all three methods come from the general machine learning calibration literature. LLM-specific benchmarks that test all three head to head are uncommon. The ICSE 2025 code calibration <a href="https://www.software-lab.org/publications/icse2025_calibration.pdf" target="_blank" rel="noopener">paper</a> is one of the few, and its scope is limited to code generation.</p>
<p>The size of the calibration set is a real deployment constraint. The isotonic regression results in these papers assume datasets large enough to constrain the mapping. In production, with a limited number of labeled examples, the gap between isotonic regression and Platt scaling may close or reverse.</p>
</p>
<h2><span># </span>Application</h2>
<p><strong>Temperature scaling</strong> is the right starting point for most teams. For base models without RLHF, a single T is often sufficient.</p>
<p>For <strong>RLHF</strong>-tuned models, switch to ATS: a per-token temperature handles the input-dependent overconfidence that a global scalar misses.</p>
<p><strong>Platt scaling</strong> is a practical choice when the calibration set is small or when calibration has to fit into a larger pipeline. It is data-efficient and simple to implement. The limitation is scope: it cannot capture miscalibration that varies from sample to sample, and it tends to degrade performance for strong models.</p>
<p><strong>Isotonic regression</strong> has the strongest empirical track record of the three. Use it when the calibration set is large enough to constrain the mapping without overfitting, and combine it with normalization-aware extensions in multiclass settings.</p>
<p>The decision that comes before all this is what &#8220;<strong>confidence</strong>&#8221; means for the task. Token probability, sequence probability, verbalized confidence, and consistency across samples can all produce different values for the same output. A calibration method applied to the wrong signal does not improve reliability. Choosing the right definition is a prerequisite for any of the above methods to work.</p>
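<p>To make the point concrete, the sketch below derives several confidence signals for one answer. All numbers are invented for illustration: the per-token log-probabilities and the repeated samples are assumptions, not output from any real model or API.</p>

```python
import math
from collections import Counter

# Hypothetical per-token log-probabilities for one generated answer.
token_logprobs = [-0.05, -0.10, -2.30, -0.02]

sequence_prob = math.exp(sum(token_logprobs))                          # product of token probabilities
mean_token_prob = math.exp(sum(token_logprobs) / len(token_logprobs))  # length-normalized (geometric mean)
min_token_prob = math.exp(min(token_logprobs))                         # weakest single token

# Hypothetical answers from five repeated generations of the same prompt.
samples = ["Paris", "Paris", "Lyon", "Paris", "Paris"]
answer, count = Counter(samples).most_common(1)[0]
consistency = count / len(samples)
```

<p>The same answer scores roughly 0.08, 0.10, 0.54, or 0.80 depending on the signal chosen, so a calibrator fit on one signal cannot be reused on another.</p>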
<p><b><strong><a href="https://twitter.com/StrataScratch" target="_blank" rel="noopener noreferrer">Nate Rosidi</a></strong></b> is a data scientist and product strategist. He is also an adjunct professor of analytics and the founder of StrataScratch, a platform that helps data scientists prepare for job interviews using real interview questions from top companies. Nate writes about the latest career trends, gives interview advice, shares data science projects, and discusses all things SQL.</p>
</p></div>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>The post <a href="https://aisckool.com/deep-dive-into-language-model-calibration-platt-scaling-isotonic-regression-temperature-scaling/">Deep dive into language model calibration: Platt scaling, isotonic regression, temperature scaling</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/deep-dive-into-language-model-calibration-platt-scaling-isotonic-regression-temperature-scaling/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i1.wp.com/www.kdnuggets.com/wp-content/uploads/Rosidi_LLM-Calibration-1.png?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>The US has a plan to combat screwworm. It involves many more flies</title>
		<link>https://aisckool.com/the-us-has-a-plan-to-combat-snails-it-covers-many-more-flies/</link>
					<comments>https://aisckool.com/the-us-has-a-plan-to-combat-snails-it-covers-many-more-flies/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Sat, 06 Jun 2026 03:40:48 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27373</guid>

					<description><![CDATA[<p>A flesh-eating parasitic fly that poses a serious threat to livestock has returned to the United States after 60 years. This week, the US Department of Agriculture confirmed the presence of New World screwworm in a calf in south Texas. Eradicated in the US in 1966 and as far south as Panama by 2006, [&#8230;]</p>
<p>The post <a href="https://aisckool.com/the-us-has-a-plan-to-combat-snails-it-covers-many-more-flies/">The US has a plan to combat screwworm. It involves many more flies</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div>
<p><span class="lead-in-text-callout">A flesh-eating parasitic</span> fly that poses a serious threat to livestock has returned to the United States after 60 years. This week, the US Department of Agriculture <a href="https://www.aphis.usda.gov/news/agency-announcements/usda-confirms-presence-new-world-screwworm-united-states" class="text link" target="_blank" rel="noopener">confirmed</a> the presence of New World screwworm in a calf in south Texas.</p>
<p class="paywall">Eradicated in the US in 1966 and as far south as <a data-offer-url="https://asm.org/articles/2025/september/new-word-screwworm-rise-fall-resurgence" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://asm.org/articles/2025/september/new-word-screwworm-rise-fall-resurgence&quot;}" href="https://asm.org/articles/2025/september/new-word-screwworm-rise-fall-resurgence" rel="nofollow noopener" target="_blank">Panama</a> by 2006, the screwworm recently re-emerged in Mexico, which increased the likelihood that it would eventually re-enter the country, and modeling showed it could arrive as early as summer 2025. It took a little longer, but the screwworm has arrived. To prevent an outbreak, officials are using a proven technique: releasing masses of adult screwworm flies.</p>
<p class="paywall">A screwworm infestation occurs when a female fly lays eggs in open wounds or other body openings of warm-blooded animals. Once the eggs hatch, larvae emerge and feed on living tissue before turning into flies. As adults, screwworm flies do not bite or feed on flesh. Scientists in the 1930s and 1940s reasoned that if they could prevent female flies from breeding, they could break the cycle. At the time, New World screwworms were killing hundreds of thousands of cattle each year, mostly in the American South and Southwest.</p>
<p class="paywall">In the 1950s, USDA researchers made a breakthrough by irradiating male screwworm flies to render them sterile. Once released into an infested area, sterile males mate with wild females, which then lay nonviable eggs. No offspring are born and the population collapses. Known as the sterile insect technique, it was first used successfully on the island of Curaçao, off the coast of Venezuela. It took just seven weeks to eradicate the pest, and the effort saved the island&#8217;s goat herds, which were a vital food source.</p>
<p class="paywall">This technique takes advantage of the fact that female New World screwworm flies mate only once in their lives. “The sterile insect technique is perhaps the most telling example of a completely effective biological control mechanism,” says Sally DeNotta, associate professor of veterinary medicine at the University of Florida. &#8220;The life cycle stops. No offspring are produced. It was very successful.&#8221;</p>
<p class="paywall">For years, a dense stretch of rainforest between Panama and Colombia known as the Darién Gap served as a biological barrier, where sterile flies were released to keep the screwworm from spreading north. However, the insects began to break through this barrier in 2022.</p>
<p class="paywall">To prevent an outbreak in south Texas, the USDA has sealed off an approximately 12-mile area around the infected calf and is conducting targeted releases of sterile screwworm flies from trucks. This is in addition to the 4 million sterile flies a week already being dropped in the area. Anticipating that the screwworm would move north, the agency in February <a href="https://www.aphis.usda.gov/news/agency-announcements/usda-shifts-sterile-fly-dispersal-efforts-defend-us-border" class="text link" target="_blank" rel="noopener">shifted</a> its effort to disperse 100 million sterile flies per week to focus on an area along the U.S.-Mexico border.</p>
<p class="paywall">“While this development poses a serious threat to our livestock and wildlife, it does not surprise us,” USDA Secretary Brooke Rollins said during a House Agriculture Committee <a href="https://www.c-span.org/program/house-committee/agriculture-secretary-brooke-rollins-testifies-on-usda-policy/680468" class="text link" target="_blank" rel="noopener">hearing</a> on Thursday.</p>
<p class="paywall">She said it takes about 400 million flies a week to push back the pest. Currently, only about 100 million flies can be produced per week, at a U.S. <a href="https://www.aphis.usda.gov/livestock-poultry-disease/stop-screwworm/sterile-fly-production-dispersal-facilities" class="text link" target="_blank" rel="noopener">facility located in Panama</a>.</p>
<p class="paywall">A sterile insect facility in Mexico closed in 2012, but the USDA is <a href="https://www.aphis.usda.gov/livestock-poultry-disease/stop-screwworm/sterile-fly-production-dispersal-facilities" class="text link" target="_blank" rel="noopener">investing $21 million</a> to help renovate and convert an existing fruit fly facility in Metapa, Mexico, to produce an additional 60-100 million sterile flies per week. According to the USDA, the facility is expected to be operational this summer.</p>
</div>
<p>The post <a href="https://aisckool.com/the-us-has-a-plan-to-combat-snails-it-covers-many-more-flies/">The US has a plan to combat screwworm. It involves many more flies</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/the-us-has-a-plan-to-combat-snails-it-covers-many-more-flies/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i1.wp.com/media.wired.com/photos/6a230980315a6d7e7d3fe38d/191:100/w_1280,c_limit/How-US-Plans-to-Stop-Screwworm-Outbreak-Science-2195014611.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>7 steps to master time series analysis in Python</title>
		<link>https://aisckool.com/7-steps-to-master-time-series-analysis-in-python/</link>
					<comments>https://aisckool.com/7-steps-to-master-time-series-analysis-in-python/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Fri, 05 Jun 2026 18:39:38 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27365</guid>

					<description><![CDATA[<p># Introduction # Step 1: Understand what makes time series data special The three most critical structural properties are summarized below: Property What it means Why it matters Temporal dependence The observations are not independent; what happened yesterday has relevance to today Standard machine learning methods assume independence of rows, so a naive application [&#8230;]</p>
<p>The post <a href="https://aisckool.com/7-steps-to-master-time-series-analysis-in-python/">7 steps to master time series analysis in Python</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div id="post-">
<p> </p>
<h2><span># </span>Introduction</h2>
</p>
<h2><span># </span>Step 1: Understand what makes time series data special</h2>
<p>The three most critical structural properties are summarized below:</p>
</p>
<table style="width: 100%; border-collapse: collapse; font-family: Arial, sans-serif; font-size: 14px; color: #333;">
<thead>
<tr style="background-color: #ffd29a;">
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Property</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">What it means</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Why it matters</th>
</tr>
</thead>
<tbody>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Temporal dependence</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>The observations are not independent; what happened yesterday has relevance to today
</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Standard machine learning methods assume independence of rows, so a naive application produces misleading results
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Stationarity</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Statistical properties remain constant over time
</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Most classical models require stationarity; most real-world series lack it and require differencing or transformation
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Seasonality and trend</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Regularly repeating patterns, or <strong>seasonality</strong>, combined with long-run directional movement, or <strong>trend</strong>
</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Separating them from the irregular remainder is often a major analytical challenge
</td>
</tr>
</tbody>
</table>
</p>
<h2><span># </span>Step 2: Master time series data structures in Python</h2>
<p>The distinction between <strong><a href="https://pandas.pydata.org/docs/user_guide/timeseries.html#dateoffset-objects" target="_blank" rel="noopener">DatetimeIndex and PeriodIndex</a></strong> is more important than it first seems.</p>
<ul>
<li><code>DatetimeIndex</code>    represents specific moments in time.
</li>
<li><code>PeriodIndex</code>    represents time intervals.
</li>
</ul>
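<p>The difference is easiest to see by converting between the two. The snippet below is a small sketch on made-up monthly sales figures; the variable names and values are illustrative only.</p>

```python
import pandas as pd

# Timestamps mark instants; here, midnight on the first day of three months.
stamps = pd.date_range("2024-01-01", periods=3, freq="MS")
sales = pd.Series([100, 120, 90], index=stamps)

# Periods mark whole intervals: each label now means "the month of ...".
monthly = sales.to_period("M")

# Converting back picks one instant inside each period (its start here).
back = monthly.to_timestamp(how="start")

first_month = monthly.index[0]
span = (first_month.start_time, first_month.end_time)  # the interval the period covers
```

<p>A value indexed by the period <code>2024-01</code> belongs to the whole month, while the timestamp <code>2024-01-01</code> is a single instant; which one is correct depends on whether the number is a point measurement or a total over an interval.</p>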
<p>Knowing when to use each of them, how to convert between them, and how to parse, slice, and resample time-indexed data can save you a lot of trouble later, as most modeling libraries have their own specific format requirements.</p>
<p>Resampling and aggregation are where many analysts make silent, significant errors. Downsampling from minute-level to hourly data requires choosing the correct aggregation function, and choosing the wrong one corrupts the analysis. Practicing resampling with multiple aggregation strategies on the same dataset until the logic becomes intuitive is time well spent.</p>
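<p>A short sketch of why the aggregation function matters, using invented minute-level data: the column names <code>price</code> and <code>volume</code> are assumptions chosen to contrast a level (where the last value or the mean makes sense) with a count (which must be summed).</p>

```python
import pandas as pd

idx = pd.date_range("2024-01-01 00:00", periods=120, freq="min")
df = pd.DataFrame(
    {
        "price": [100.0 + i for i in range(120)],  # a level: take the last value or the mean
        "volume": [1] * 120,                       # a count: add it up
    },
    index=idx,
)

# One aggregation per column, chosen to match what the column means.
hourly = df.resample("h").agg({"price": "last", "volume": "sum"})

# A blanket mean runs without error but silently turns 60 units per hour into 1.
wrong = df.resample("h").mean()
```

<p>Both calls succeed, which is exactly the problem: nothing in the output signals that the averaged volume is off by a factor of 60.</p>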
<p><strong><a href="https://pandas.pydata.org/docs/user_guide/window.html" target="_blank" rel="noopener">Rolling and expanding windows</a></strong> (<code>.rolling()</code> and <code>.expanding()</code>) are the pandas primitives for lag features and cumulative statistics. Building moving averages, standard deviations, and lag offsets by hand before relying on library abstractions is important: understanding what these operations do at the index level prevents a whole class of subtle data leakage errors that are extremely hard to diagnose after the fact.</p>
<p><strong>Resource</strong>: Work through the <strong><a href="https://pandas.pydata.org/docs/user_guide/timeseries.html" target="_blank" rel="noopener">pandas guide to time series and date functionality</a></strong> with a real dataset before continuing.</p>
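<p>The leakage risk comes down to whether the window includes the value being predicted. The following sketch, on a five-point toy series, builds features that only look backwards and contrasts them with a leaky variant; the column names are illustrative.</p>

```python
import pandas as pd

s = pd.Series(
    [10.0, 12.0, 11.0, 15.0, 14.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

features = pd.DataFrame({"y": s})
features["lag_1"] = s.shift(1)
# shift(1) comes first, so the window at row t covers t-3 .. t-1 only.
features["roll_mean_3"] = s.shift(1).rolling(window=3).mean()
features["expanding_mean"] = s.shift(1).expanding().mean()

# Leaky variant for contrast: the window at row t includes y[t] itself.
leaky = s.rolling(window=3).mean()
```

<p>On 2024-01-04 the safe rolling mean is 11.0 (the average of the three earlier days), while the leaky one is about 12.67 because it already contains that day&#8217;s value of 15.</p>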
</p>
<h2><span># </span>Step 3: Learn how to clean and prepare time series data</h2>
<ul>
<li>Global statistical thresholds may miss anomalies in non-stationary series.
</li>
<li>Rolling z-scores and IQR bounds computed over sliding windows help detect values that are anomalous relative to their local neighborhood.
</li>
<li>For multi-dimensional sensor data, <strong><a href="https://scikit-learn.org/stable/modules/outlier_detection.html#isolation-forest" target="_blank" rel="noopener">Isolation Forest</a></strong> detects anomalies that may not appear in individual channels but do appear in the joint feature space.
</li>
</ul>
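<p>The first two points can be sketched together. The series below is synthetic (a linear trend plus a sine wave with one injected spike), and the window length and thresholds are arbitrary choices for the illustration, not recommendations.</p>

```python
import numpy as np
import pandas as pd

n = 200
t = np.arange(n)
values = 0.1 * t + np.sin(t)  # non-stationary: the level drifts upward
values[60] += 10.0            # injected local anomaly
s = pd.Series(values)

window = 20
# Local statistics use the previous `window` points only (shift excludes the current one).
local_mean = s.shift(1).rolling(window).mean()
local_std = s.shift(1).rolling(window).std()
rolling_z = (s - local_mean) / local_std

# Global statistics are dominated by the trend, so the spike looks ordinary.
global_z = (s - s.mean()) / s.std()

flag_local = rolling_z.abs() > 5
flag_global = global_z.abs() > 3
```

<p>The spike at position 60 sits well inside the overall range of the series, so the global threshold flags nothing, while the rolling z-score singles it out.</p>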
<p><strong>Resource</strong>: The <strong><a href="https://www.sktime.net/en/stable/api_reference/transformations.html" target="_blank" rel="noopener">sktime transformations documentation</a></strong> covers the most common preprocessing transformations with helpful examples.</p>
</p>
<h2><span># </span>Step 4: Develop intuition through exploratory analysis</h2>
<ul>
<li>Is the trend linear or non-linear?
</li>
<li>Is the seasonal amplitude stable or does it change over time?
</li>
<li>Is the residual approximately white noise, or does it contain structure that the decomposition missed?
</li>
</ul>
<p>Another critical diagnostic is autocorrelation analysis. Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots are essential tools for understanding temporal dependence:</p>
<ul>
<li>A slowly decaying ACF signals non-stationarity.
</li>
<li>Significant spikes at lag 24 in hourly data signal daily seasonality.
</li>
<li>The lag at which the PACF cuts off suggests the autoregressive (AR) order.
</li>
</ul>
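<p>The daily-seasonality signal is easy to reproduce by computing the sample ACF by hand. The sketch below uses NumPy on a synthetic hourly sine wave; the helper name <code>acf</code> is hypothetical, and in practice the plotting helpers <code>plot_acf</code> and <code>plot_pacf</code> from statsmodels would normally be used instead.</p>

```python
import numpy as np


def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


# Two weeks of synthetic hourly data with a daily cycle (period 24).
hours = np.arange(24 * 14)
series = np.sin(2 * np.pi * hours / 24)

r = acf(series, max_lag=36)
```

<p>The autocorrelation peaks near lag 24 and is strongly negative at lag 12, the half-day point, which is the pattern to look for in real hourly data.</p>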
<p>Reading these charts fluently is essential in any classic modeling work.</p>
<p>Stationarity testing complements the exploratory workflow. The <strong><a href="https://www.statsmodels.org/dev/examples/notebooks/generated/stationarity_detrending_adf_kpss.html" target="_blank" rel="noopener">Augmented Dickey-Fuller (ADF) test and Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test</a></strong> provide statistical evidence for or against stationarity, and it is worth running both because they test complementary hypotheses. The results indicate whether differencing or transformation is needed before modeling.</p>
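<p>When the tests point to non-stationarity, differencing is usually the first remedy. The sketch below applies it to a synthetic series with a linear trend and a period-4 seasonal pattern; the numbers are invented for the illustration.</p>

```python
import numpy as np
import pandas as pd

t = np.arange(48)
season = np.tile([0.0, 2.0, 0.0, -2.0], 12)  # repeating period-4 pattern
s = pd.Series(5.0 + 0.5 * t + season)        # linear trend plus seasonality

first_diff = s.diff().dropna()       # removes the trend, keeps a seasonal wiggle
seasonal_diff = s.diff(4).dropna()   # removes the period-4 pattern as well
```

<p>After seasonal differencing the series is constant at 2.0 (four steps of the 0.5 slope), so no trend or seasonal structure remains to model.</p>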
</p>
<h2><span># </span>Step 5: Build classic statistical forecasting models</h2>
<p><strong>Resource</strong>: <a href="https://otexts.com/fpp3/expsmoothing.html" target="_blank" rel="noopener">Forecasting: Principles and Practice, Chapters 7–9</a> for ETS and ARIMA, and the <strong><a href="https://www.statsmodels.org/stable/statespace.html" target="_blank" rel="noopener">statsmodels state space documentation</a></strong> for details on the Python-specific implementation.</p>
</p>
<h2><span># </span>Step 6: Move to machine learning and deep learning models</h2>
<p>Tree-based models such as <strong><a href="https://lightgbm.readthedocs.io/en/stable/" target="_blank" rel="noopener">LightGBM</a></strong> and <strong><a href="https://xgboost.readthedocs.io/en/stable/" target="_blank" rel="noopener">XGBoost</a></strong> generate powerful forecasts when given well-designed lag features, rolling statistics, and calendar variables. They handle non-linearity and interactions between features automatically, but the main risk is data leakage: lags must be constructed solely from values that precede the prediction timestamp. sktime&#8217;s <code>make_reduction</code> wraps scikit-learn regressors as forecasters and handles this bookkeeping correctly.</p>
<p><strong><a href="https://otexts.com/fpp3/nnetar.html" target="_blank" rel="noopener">Deep learning architectures</a></strong> have the best track record on benchmark datasets and handle multiple seasonalities, covariates, and long-horizon forecasting better than classical models. NeuralForecast implements these with a consistent API and proper temporal cross-validation support. The right time to turn to deep learning is after simpler models have plateaued, not before.</p>
<p><strong>Resource</strong>: The <strong><a href="https://www.kaggle.com/competitions/m5-forecasting-accuracy/code" target="_blank" rel="noopener">Kaggle M5 Forecasting competition notebooks</a></strong> are a good starting point, and <a href="https://www.kaggle.com/competitions/m5-forecasting-accuracy/code?competitionId=18599&#038;sortBy=voteCount&#038;excludeNonAccessedDatasources=true" target="_blank" rel="noopener">the best solutions</a>, which are publicly available, cover the entire process from feature engineering to ensembling on a real-world retail forecasting problem.</p>
</p>
<h2><span># </span>Step 7: Deploy and monitor forecasting systems</h2>
<p>Forecast storage and versioning require thoughtful design. Production forecasting systems generate forecasts continuously, and storing each forecast together with the actual value it predicted, not just the final model outputs, lets you calculate retrospective accuracy at each horizon and understand exactly where the model deteriorates over time.</p>
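<p>One possible layout is a long table with one row per forecast, keyed by when it was made and which date it targets. The schema and numbers below are assumptions for the sketch, not a prescribed design.</p>

```python
import pandas as pd

# Hypothetical forecast log: each row records when the forecast was made and for which date.
forecasts = pd.DataFrame(
    {
        "model_version": ["v1"] * 4,
        "made_at": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"]),
        "target_date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-03", "2024-01-04"]),
        "forecast": [100.0, 110.0, 104.0, 120.0],
    }
)
# Actuals arrive later and are stored separately, keyed by the date they describe.
actuals = pd.DataFrame(
    {
        "target_date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
        "actual": [102.0, 105.0, 111.0],
    }
)

scored = forecasts.merge(actuals, on="target_date", how="left")
scored["horizon_days"] = (scored["target_date"] - scored["made_at"]).dt.days
scored["abs_error"] = (scored["forecast"] - scored["actual"]).abs()
mae_by_horizon = scored.groupby("horizon_days")["abs_error"].mean()
```

<p>Because the forecast origin is stored, accuracy can be broken down by horizon after the fact: here the one-day-ahead error averages 1.5 and the two-day-ahead error 7.0.</p>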
<p>Backtesting as a gate to deployment is the discipline that separates experiments from production-ready systems. Before any model is deployed, rigorous backtesting should simulate the entire deployment window using only the data that would have been available at each step. A model that looks good on a held-out test set but does not backtest well is not ready.</p>
<p><strong>Resource</strong>: The <strong><a href="https://www.evidentlyai.com/ml-in-production/model-monitoring" target="_blank" rel="noopener">Evidently AI model monitoring guide</a></strong> covers machine learning monitoring, including data drift and prediction drift detection.</p>
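<p>A rolling-origin backtest can be sketched in plain Python. The function names and the naive baseline below are hypothetical; the point is only that each forecast sees nothing at or after its origin.</p>

```python
def rolling_origin_backtest(series, forecast_fn, initial, horizon, step=1):
    """Walk forward through `series`; at each origin, forecast from past data only."""
    errors = []
    origin = initial
    while origin + horizon <= len(series):
        history = series[:origin]  # nothing at or after the origin is visible
        forecast = forecast_fn(history, horizon)
        actual = series[origin:origin + horizon]
        errors.append([abs(f - a) for f, a in zip(forecast, actual)])
        origin += step
    # Mean absolute error per horizon (1 step ahead, 2 steps ahead, ...).
    return [sum(col) / len(col) for col in zip(*errors)]


def naive_last_value(history, horizon):
    # Hypothetical baseline: repeat the last observed value.
    return [history[-1]] * horizon


series = [float(v) for v in range(20)]  # toy series rising by 1 per step
mae_by_horizon = rolling_origin_backtest(series, naive_last_value, initial=10, horizon=3)
```

<p>On this steadily rising toy series the naive forecast falls one unit further behind with every step ahead, so the error grows with the horizon. Any candidate model can be passed in place of <code>naive_last_value</code> and compared against it on the same origins.</p>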
</p>
<h2><span># </span>Summary</h2>
</p>
<table style="width: 100%; border-collapse: collapse; font-family: Arial, sans-serif; font-size: 14px; color: #333;">
<thead>
<tr style="background-color: #ffd29a;">
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Step</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Why it matters</th>
</tr>
</thead>
<tbody>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Basic properties of time series data</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Without an understanding of temporal dependence, stationarity, and seasonality, every subsequent decision rests on shaky ground
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Pandas time-aware data structures</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Correct indexing, resampling, and windowing operations are prerequisites for any analysis and modeling task
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Cleaning and preparation</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Errors introduced here propagate silently throughout the pipeline; the temporal ordering makes them harder to catch than in tabular cleaning
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Exploratory analysis</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>Decomposition, autocorrelation plots, and stationarity tests reveal the structure that determines which models are appropriate
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Classic statistical models</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>They force structured engagement with the data; they are often competitive with more complex approaches and always useful as a baseline
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Machine learning and deep learning models</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>They extend what is possible to non-linear patterns, rich feature sets, and large collections of series once the classic baselines are understood
</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;">Implementation and monitoring</td>
<td style="padding: 12px; border: 1px solid #ddd;">
<p>A model that cannot be maintained in production is not a finished product; time series systems require domain-specific operational discipline
</td>
</tr>
</tbody>
</table>
<p><b><a href="https://twitter.com/balawc27" rel="noopener" target="_blank"><strong><a href="https://www.kdnuggets.com/wp-content/uploads/bala-priya-author-image-update-230821.jpg" target="_blank" rel="noopener noreferrer">Bala Priya C</a></strong></a></b> is a software developer and technical writer from India. She likes working at the intersection of mathematics, programming, data analytics, and content creation. Her areas of interest and specialization include DevOps, data analytics, and natural language processing. She enjoys reading, writing, coding, and coffee! She is currently learning and sharing her knowledge with the developer community by writing tutorials, guides, reviews, and more. Bala also creates fascinating resource overviews and coding tutorials.</p>
</p></div>
<p>The post <a href="https://aisckool.com/7-steps-to-master-time-series-analysis-in-python/">7 steps to master time series analysis in Python</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/7-steps-to-master-time-series-analysis-in-python/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i1.wp.com/www.kdnuggets.com/wp-content/uploads/kdn-7-steps-time-series-analyis.png?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>I don&#8217;t want to alarm anyone, but flesh-eating screwworms have arrived in the US</title>
		<link>https://aisckool.com/i-dont-want-to-alarm-anyone-but-carnivorous-snails-have-arrived-in-the-us/</link>
					<comments>https://aisckool.com/i-dont-want-to-alarm-anyone-but-carnivorous-snails-have-arrived-in-the-us/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Fri, 05 Jun 2026 09:38:26 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27349</guid>

					<description><![CDATA[<p>On Wednesday evening, the U.S. Department of Agriculture announced that a case of New World screwworm had been confirmed in south Texas. It is the first detected breach of the US-Mexico border by the voracious, flesh-eating flies that have been creeping up through Central America for several years. In a social media post on Wednesday [&#8230;]</p>
<p>The post <a href="https://aisckool.com/i-dont-want-to-alarm-anyone-but-carnivorous-snails-have-arrived-in-the-us/">I don&#8217;t want to alarm anyone, but flesh-eating screwworms have arrived in the US</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div>
<p>On Wednesday evening, the U.S. Department of Agriculture announced that a case of New World screwworm had been confirmed in south Texas. It is the first detected breach of the US-Mexico border by the <a href="https://arstechnica.com/health/2025/05/screwworms-are-coming-and-theyre-just-as-horrifying-as-they-sound/" class="text link" target="_blank" rel="noopener">voracious, flesh-eating flies</a> that have <a href="https://arstechnica.com/health/2025/09/flesh-eating-parasite-just-70-miles-from-us-check-pets-texas-officials-say/" class="text link" target="_blank" rel="noopener">been creeping up</a> through Central America for several years.</p>
<p class="paywall">In a <a data-offer-url="https://x.com/USDA/status/2062245310689345981" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://x.com/USDA/status/2062245310689345981&quot;}" href="https://x.com/USDA/status/2062245310689345981" rel="nofollow noopener" target="_blank">social media post on Wednesday afternoon</a>, the USDA disclosed that a Texas sample had been sent to the National Veterinary Services Laboratories (NVSL) in Ames, Iowa, to confirm a screwworm infection. Secretary of Agriculture Brooke Rollins later posted that <a data-offer-url="https://x.com/SecRollins/status/2062344848431018088" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://x.com/SecRollins/status/2062344848431018088&quot;}" href="https://x.com/SecRollins/status/2062344848431018088" rel="nofollow noopener" target="_blank">testing confirmed the infection</a>, which was found in a three-week-old calf in Zavala County, Texas.</p>
<p class="paywall">News of the screwworm&#8217;s detection had been building all week, sending shockwaves through the U.S. cattle industry.</p>
<p class="paywall">Although many animals, including humans, can fall victim to the parasite, the screwworm is particularly threatening to livestock. Female screwworm flies lay hundreds of eggs in the wounds and orifices of warm-blooded animals, and their larvae feed on the living flesh, causing deep, festering, life-threatening wounds. Although the screwworm was once endemic to the U.S., it was eradicated in the 1960s after years of control efforts. The USDA estimates that keeping screwworms out of the United States <a href="https://www.nal.usda.gov/exhibits/speccoll/exhibits/show/stop-screwworms--selections-fr/introduction" class="text link" target="_blank" rel="noopener">saves the livestock industry $900 million a year</a>.</p>
<p class="paywall">But the fly has broken through containment in Central America and has been moving closer. According to the USDA, on May 28, a case of the infection was detected within 25 miles of the border in a five-year-old goat in Coahuila, Mexico. The case was one of many detected in recent days, including one in a calf just 60 km from the border, also in Coahuila.</p>
<h2 class="paywall">Disputed findings</h2>
<p class="paywall">In a media call Tuesday, Agriculture Secretary Brooke Rollins said, &#8220;There&#8217;s no question that this is a very, very serious threat to our livestock.&#8221; But she also disputed claims that the fly is closer or even already in the US.</p>
<p class="paywall">On Monday, state Rep. Don McLaughlin claimed on social media that a screwworm case had been found just <a data-offer-url="https://x.com/donfortexas/status/2061504920021241934" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://x.com/donfortexas/status/2061504920021241934&quot;}" href="https://x.com/donfortexas/status/2061504920021241934" rel="nofollow noopener" target="_blank">one mile from the Texas border</a>, which Rollins and <a data-offer-url="https://x.com/USDA/status/2061612189287354459" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://x.com/USDA/status/2061612189287354459&quot;}" href="https://x.com/USDA/status/2061612189287354459" rel="nofollow noopener" target="_blank">the USDA denied</a>.</p>
<p class="paywall">“When false information comes to light, there is huge panic,” <a data-offer-url="https://www.texastribune.org/2026/06/02/texas-screwworm-1-mile-brooke-rollins-don-mclaughlin/" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://www.texastribune.org/2026/06/02/texas-screwworm-1-mile-brooke-rollins-don-mclaughlin/&quot;}" href="https://www.texastribune.org/2026/06/02/texas-screwworm-1-mile-brooke-rollins-don-mclaughlin/" rel="nofollow noopener" target="_blank">Rollins said Tuesday, according to the Texas Tribune</a>. “And rightfully so, especially when it comes from elected officials and the media.”</p>
<p class="paywall">On Wednesday, <a href="https://www.reuters.com/business/healthcare-pharmaceuticals/unconfirmed-us-case-flesh-eating-screwworm-rattles-cattle-markets-traders-say-2026-06-03/" class="text link" target="_blank" rel="noopener">Reuters reported that McLaughlin suspected the fly was already here</a>. He said samples taken Tuesday from two calves at a ranch in La Pryor, Texas &#8211; in Zavala County, where the screwworm infection was later confirmed &#8211; were being tested for possible screwworm infection. One infection was said to be in the umbilical cord wound of one of the calves. McLaughlin said he had seen photos and videos of the animals and that the larvae in the photos looked like screwworm larvae.</p>
<p class="paywall">Reuters was shown one photo that it said showed &#8220;multiple screwworm-like larvae inside a bloody circular wound on the animal,&#8221; but said it &#8220;could not immediately verify the photo.&#8221;</p>
<p class="paywall">“At this point, it is unconfirmed that it is a New World screwworm,” McLaughlin said Wednesday. “It looks like it, but it&#8217;s unconfirmed.”</p>
<p class="paywall">With the detection now confirmed, the USDA said in a <a href="https://www.aphis.usda.gov/news/agency-announcements/usda-confirms-presence-new-world-screwworm-united-states" class="text link" target="_blank" rel="noopener">press release on Wednesday evening</a> that it is forming a &#8220;unified incident command team&#8221; with the Texas Animal Health Commission and sending response personnel to the area. It is also establishing a 20-kilometer zone around the detected infection for quarantine, movement restrictions, and increased surveillance and fly trapping.</p>
<h2 class="paywall">Screwworms are back</h2>
<p class="paywall">Screwworms were eradicated from the US in the 1960s through a concerted effort to wipe out their populations. This was done by releasing sterile male flies from aircraft, which is the most effective weapon against the parasite. The mass release of sterile males crowds out fertile males, preventing them from mating with females, which typically mate only once.</p>
<p class="paywall">Thanks to this method, called the sterile insect technique, the flies were eradicated not only in the US but throughout Central America. In 2006, they were declared eradicated in Panama.</p>
</div>
<p>The post <a href="https://aisckool.com/i-dont-want-to-alarm-anyone-but-carnivorous-snails-have-arrived-in-the-us/">I don&#8217;t want to alarm anyone, but flesh-eating screwworms have arrived in the US</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/i-dont-want-to-alarm-anyone-but-carnivorous-snails-have-arrived-in-the-us/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i3.wp.com/media.wired.com/photos/6a21dc78733250a1d799491a/191:100/w_1280,c_limit/GettyImages-2212569697%20(1).jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>What the age of the agent means for data science</title>
		<link>https://aisckool.com/what-the-age-of-the-agent-means-for-data-science/</link>
					<comments>https://aisckool.com/what-the-age-of-the-agent-means-for-data-science/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Fri, 05 Jun 2026 00:37:31 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27341</guid>

					<description><![CDATA[<p># Introduction Something has changed at the intersection of AI and data science, and it has changed the way practitioners work. The systems being deployed today do not just generate a response and stop. They plan. They carry out multi-step tasks. They invoke external tools, evaluate their own results, and backtrack when the results are [&#8230;]</p>
<p>The post <a href="https://aisckool.com/what-the-age-of-the-agent-means-for-data-science/">What the age of the agent means for data science</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div id="post-">
<p> </p>
<h2><span># </span>Introduction</h2>
<p>Something has changed at the intersection of AI and data science, and it has changed the way practitioners work. The systems being deployed today do not just generate a response and stop. They plan. They carry out multi-step tasks. They invoke external tools, evaluate their own results, and backtrack when the results are insufficient.</p>
<p>We are no longer entering the era of agents. We live in it. This period is defined by artificial intelligence systems that behave autonomously in pursuit of goals, and it has redefined what data scientists actually do every day.</p>
<p>The role has always required a rare combination of statistical thinking, programming skills, and domain knowledge. A fourth dimension is now the baseline: the ability to design, deploy, and evaluate systems that operate independently on behalf of users. Ignore this change and your productivity will lag behind that of your peers. Take it seriously and your effectiveness will rise in everything you touch.</p>
</p>
<h2><span># </span>Defining the new baseline</h2>
<p>To understand what&#8217;s at stake, let&#8217;s look at what an AI agent actually does in a production environment today. An agent is a system that perceives its environment, reasons about its next move, takes action using available tools, and evaluates the results.</p>
<p>Unlike the classic large language model (LLM) interaction, where you submit a prompt and receive a single static response, the agent operates in continuous, iterative loops. It receives a goal, selects a tool, observes the outcome, updates its reasoning, and either pivots or pushes forward. This cycle may involve dozens of separate steps behind the scenes.</p>
<p>What sets this paradigm apart is its native tool integration. In today&#8217;s data science context, an agent can ingest a dataset, inspect it, perform exploratory analysis, train a baseline model, evaluate the results, and generate a structured report &#8211; all without human intervention in the procedural steps.</p>
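<p>The loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: <code>choose_action</code> is a hard-coded stand-in for the model&#8217;s reasoning step, and the two tools and the goal format are hypothetical, not taken from any agent framework.</p>

```python
from statistics import mean


def summarize(data):
    """Tool 1: basic summary statistics."""
    return {"n": len(data), "mean": mean(data)}


def find_outliers(data, threshold=10.0):
    """Tool 2: values further than `threshold` from the mean."""
    center = mean(data)
    return [x for x in data if abs(x - center) > threshold]


TOOLS = {"summarize": summarize, "find_outliers": find_outliers}


def choose_action(goal, observations):
    """Stand-in for the LLM reasoning step: pick the next tool, or None to stop."""
    for name in goal["required"]:
        if name not in observations:
            return name
    return None


def run_agent(goal, data, max_steps=10):
    """Goal in, observations out: select a tool, act, observe, repeat."""
    observations = {}
    for _ in range(max_steps):               # hard cap so the loop always ends
        action = choose_action(goal, observations)
        if action is None:                   # goal satisfied, stop iterating
            break
        observations[action] = TOOLS[action](data)
    return observations


goal = {"required": ["summarize", "find_outliers"]}
report = run_agent(goal, [10, 11, 9, 10, 30])
print(report)
```

<p>In a real agent the selection step would be a model call that reads the goal and the observations so far; the surrounding loop, the tool registry, and the step cap stay much the same.</p>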
</p>
<h2><span># </span>Orchestration ecosystem</h2>
<p>The frameworks that make this possible have evolved from experimental libraries to production-grade orchestrators. They all work on the same basic principle &#8211; providing the model with structured access to tools and a reasoning engine to use them &#8211; but take different approaches depending on the workflow.</p>
</p>
<table style="width: 100%; border-collapse: collapse; font-family: Arial, sans-serif; font-size: 14px; color: #333;">
<thead>
<tr style="background-color: #ffd29a;">
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Framework</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Design philosophy</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">Primary data science use case</th>
<th style="padding: 12px; border: 1px solid #ddd; text-align: left;">2026 context</th>
</tr>
</thead>
<tbody>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;"><strong><a href="https://www.langchain.com/langgraph" target="_blank" rel="noopener">LangGraph</a></strong></td>
<td style="padding: 12px; border: 1px solid #ddd;">Graph-based workflow orchestration.</td>
<td style="padding: 12px; border: 1px solid #ddd;">Complex, conditional pipelines requiring state management.</td>
<td style="padding: 12px; border: 1px solid #ddd;">The industry standard for production-level workflows, both single and multi-agent, where explicit state management and conditional branching are required.</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;"><strong><a href="https://microsoft.github.io/autogen/" target="_blank" rel="noopener">AutoGen</a></strong></td>
<td style="padding: 12px; border: 1px solid #ddd;">Patterns of multi-agent conversation.</td>
<td style="padding: 12px; border: 1px solid #ddd;">Collaborative scenarios in which agents discuss or verify results.</td>
<td style="padding: 12px; border: 1px solid #ddd;">Good fit for workflows with built-in review steps, where a critic agent checks the coding agent&#8217;s reasoning. Note: The v0.2 and v0.4/AG2 architectures are significantly different, so check which version your documentation covers before delving into the details.</td>
</tr>
<tr>
<td style="padding: 12px; border: 1px solid #ddd;"><strong><a href="https://github.com/huggingface/smolagents" target="_blank" rel="noopener">smolagents</a></strong></td>
<td style="padding: 12px; border: 1px solid #ddd;">Minimalist code-driven execution.</td>
<td style="padding: 12px; border: 1px solid #ddd;">Code-intensive tasks using the full Python science stack.</td>
<td style="padding: 12px; border: 1px solid #ddd;">A natural fit for data scientists who are already comfortable in plain Python environments.</td>
</tr>
</tbody>
</table>
<h2><span># </span>The workflow shift: from procedural to evaluative</h2>
</p>
<p>The most direct impact on everyday work is the automation of routine processes. Take a standard exploratory data analysis (EDA) pipeline. A data scientist used to manually import data, generate summary statistics, visualize distributions, and look for outliers. Today, a well-designed agent performs each of these steps as instructed, documents observations in structured formats, and flags anomalies for human review.</p>
<p>This also applies to machine learning engineering. Pipelines that once required manual iteration over preprocessing choices, model selection, and hyperparameter tuning are now largely managed through agent-based orchestration, which reduces, but does not eliminate, the need for human judgment at key decision points.</p>
<p>This last part is crucial. The shift does not eliminate the data scientist. It moves the role toward higher-order decisions. Agents take on the procedural burden; you retain the evaluative weight. Agents handle the repetitive &#8220;how do I do this again&#8221; questions that consume hours. You make the &#8220;is this right&#8221; judgment that no model can replicate.</p>
</p>
<h2><span># </span>Skill stack for 2026</h2>
<p>Technical proficiency in Python, statistics, and machine learning remains an irreducible foundation. However, the agentic reality requires a new layer of competencies built on that foundation.</p>
<ul>
<li><strong>System design and prompt engineering:</strong> Agents follow instructions, and the architecture of these instructions sets an upper limit on output quality. This goes far beyond writing a clear prompt. When you design an agent, you make decisions that determine its behavior across hundreds of different inputs: how to break down a high-level goal into executable subtasks, how to define constraints so that the agent doesn&#8217;t fill in the gaps itself, and how to specify output formats so that subsequent steps can use the results without ambiguity. Treat prompt engineering the same way you treat software design. Version your prompts, test them against edge cases, and document your reasoning. A prompt that works on ten examples but breaks on the eleventh is not ready for deployment.
</li>
<li><strong>Tool design and integration:</strong> Agents are only as capable as the tools they can use. A tool is any function that an agent can invoke to interact with the outside world: a database query, a web scraper, an API call, or a script that runs a statistical test. If your tool silently accepts invalid input or returns ambiguous output, the agent will propagate these errors through every subsequent step. Good tool design means typed inputs, structured error messages that the agent can reason about, and consistent return formats. Think of each tool as a contract: here&#8217;s what I accept, here&#8217;s what I give back, here&#8217;s what happens when something goes wrong.
</li>
<li><strong>Agent observability:</strong> When an agent performs a long chain of sequential steps, debugging requires a structured evaluation framework. Agent failures are often not obvious. A classic software bug raises an error on a specific line. An agent failure may look like a perfectly reasonable sequence of steps that, a few steps later, produces a subtly wrong result. Without tracing, there is no way to reconstruct what actually happened. At a minimum, record the inputs and outputs of each tool invocation, the agent&#8217;s reasoning at each decision point, and the final result along with the original goal. Tools like <strong><a href="https://www.langchain.com/langsmith" target="_blank" rel="noopener">LangSmith</a></strong> and <strong><a href="https://langfuse.com/" target="_blank" rel="noopener">Langfuse</a></strong> are worth knowing here. With this data, you can build systematic evaluations and identify where an agent tends to go off track.
</li>
<li><strong>Multi-agent architecture:</strong> Complex tasks are routinely distributed among specialized agents &#8211; such as a data extractor, a statistical analyzer, and a report generator. The reason is not novelty; you modularize code for the same reason. Specialized components are easier to test and easier to reason about in isolation. The design challenge is coordination. Agents must pass information to each other in a way that is consistent throughout the pipeline, which means defining clear interfaces between agents up front. The decision on how to handle failures must also be made at the design stage: if one of the agents fails halfway through, will the system retry, roll back, or surface the failure to a human reviewer? Getting this right at the beginning can save you a lot of rework later.
</li>
</ul>
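<p>The tool-as-contract and tracing ideas from the list above can be sketched together. This is a minimal illustration under stated assumptions: <code>ToolResult</code>, <code>column_mean</code>, <code>traced_call</code>, and the in-memory <code>TRACE</code> list are hypothetical names, and a production setup would more likely send these records to a dedicated tracing backend.</p>

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ToolResult:
    """Consistent return format: every tool call yields ok, value, and error."""
    ok: bool
    value: Any = None
    error: Optional[str] = None


def column_mean(rows: list, column: str) -> ToolResult:
    """Hypothetical tool: mean of a numeric column, with explicit failure modes."""
    if not rows:
        return ToolResult(ok=False, error="empty_input: no rows supplied")
    missing = [i for i, row in enumerate(rows) if column not in row]
    if missing:
        return ToolResult(
            ok=False,
            error=f"missing_column: '{column}' absent in rows {missing}",
        )
    values = [row[column] for row in rows]
    if not all(isinstance(v, (int, float)) for v in values):
        return ToolResult(ok=False, error=f"type_error: '{column}' is not numeric")
    return ToolResult(ok=True, value=sum(values) / len(values))


TRACE = []  # in-memory trace; stands in for a real tracing backend


def traced_call(tool, **kwargs) -> ToolResult:
    """Run a tool and record its name, inputs, and result for later inspection."""
    result = tool(**kwargs)
    TRACE.append({"tool": tool.__name__, "inputs": kwargs, "result": asdict(result)})
    return result


rows = [{"price": 10.0}, {"price": 14.0}]
good = traced_call(column_mean, rows=rows, column="price")
bad = traced_call(column_mean, rows=rows, column="cost")  # fails loudly, not silently
print(json.dumps(TRACE, indent=2))
```

<p>Because failures come back as structured values rather than silent defaults, the calling agent (or a human reading the trace) can see exactly which step went wrong and why.</p>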
<h2><span># </span>Evolution of roles</h2>
<p>None of this eliminates data science jobs. It raises the ceiling on what an individual practitioner can ship. The roles emerging from this shift reflect a clear division between those who use agents and those who build them.</p>
<ul>
<li><strong>AI system designers</strong> determine agent behavior, define evaluation criteria, and oversee multi-agent pipelines by combining deep data science knowledge with systems thinking.
</li>
<li><strong>AgentOps engineers</strong> represent a specialized evolution of machine learning operations (MLOps) focused on deploying, tracing, and monitoring autonomous workflows in production, where failure modes are much less predictable than in classic machine learning.
</li>
<li><strong>Domain-specialized builders</strong> occupy the most defensible niche: a data scientist with deep expertise in finance or healthcare who builds agent pipelines for their specific industry. It&#8217;s a combination that&#8217;s hard to replicate.
</li>
</ul>
<h2><span># </span>Keeping pace</h2>
<p>For practitioners still catching up, the practical starting point is deliberately modest. Don&#8217;t try to automate all your work tomorrow.</p>
<p>Start with a single-agent system using smolagents or LangGraph. Give it access to two tools appropriate for a task you&#8217;re already doing manually, and run it on a problem where you know the expected outcome. Evaluate it honestly. Once it&#8217;s working reliably, bring in a second agent for a different specialty. Set up logging, define success criteria, and run systematic tests.</p>
<p>The data scientists who will thrive here are those who use these tools to build practical intuition and develop the evaluative thinking required to deploy autonomous systems responsibly. The only way to keep up is to participate in building it.</p>
<p><strong><strong><a href="https://www.linkedin.com/in/vc1401/" target="_blank" rel="noopener noreferrer">Vinod Chugani</a></strong></strong> is an artificial intelligence and data science educator who bridges the gap between emerging artificial intelligence technologies and practical applications for working professionals. His areas of interest include agentic artificial intelligence, machine learning applications, and workflow automation. Through his work as a technical mentor and instructor, Vinod has supported data professionals in skill development and career transitions. He brings analytical expertise from quantitative finance to his hands-on teaching approach. His content emphasizes practical strategies and frameworks that professionals can implement immediately.</p>
</p></div>
<p>The post <a href="https://aisckool.com/what-the-age-of-the-agent-means-for-data-science/">What the age of the agent means for data science</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/what-the-age-of-the-agent-means-for-data-science/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i0.wp.com/www.kdnuggets.com/wp-content/uploads/kdn-what-the-agentic-era-means-for-data-science.png?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>OpenAI and Anthropic Letter on the Prevention of AI-Developed Biological Weapons</title>
		<link>https://aisckool.com/openai-and-anthropic-letter-on-the-prevention-of-ai-developed-biological-weapons/</link>
					<comments>https://aisckool.com/openai-and-anthropic-letter-on-the-prevention-of-ai-developed-biological-weapons/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Thu, 04 Jun 2026 15:35:45 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27321</guid>

					<description><![CDATA[<p>The CEOs of several major artificial intelligence companies are pushing members of Congress to pass new legislation that would make it harder for bad actors to develop biological weapons using their technology. Demis Hassabis of Google DeepMind, Sam Altman of OpenAI, Dario Amodei of Anthropic and Mustafa Suleyman of Microsoft AI were among the [&#8230;]</p>
<p>The post <a href="https://aisckool.com/openai-and-anthropic-letter-on-the-prevention-of-ai-developed-biological-weapons/">OpenAI and Anthropic Letter on the Prevention of AI-Developed Biological Weapons</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div>
<p><span class="lead-in-text-callout">The CEOs of</span> several major artificial intelligence companies are pushing members of Congress to pass new legislation that would make it harder for bad actors to develop biological weapons using their technology.</p>
<p class="paywall">Demis Hassabis of Google DeepMind, Sam Altman of OpenAI, Dario Amodei of Anthropic and Mustafa Suleyman of Microsoft AI were among the signatories of a <a data-offer-url="https://screendna.org/" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://screendna.org/&quot;}" href="https://screendna.org/" rel="nofollow noopener" target="_blank">public letter</a> calling for legislation requiring companies selling synthetic DNA and RNA to screen customers and to prevent the misuse of genetic material.</p>
<p class="paywall">The letter, organized by the nonpartisan Institute for Progress and the right-wing Foundation for American Innovation, acknowledged that given the pace of artificial intelligence development, &#8220;there is a real possibility that the knowledge barriers that have historically prevented bad actors from obtaining biological weapons will significantly erode.&#8221;</p>
<p class="paywall">Scientist Arthur Kornberg was the first to successfully synthesize DNA in the 1950s. Today, the process is automated, with dozens of companies around the world using commercial synthesizers to &#8220;print&#8221; and sell custom genetic sequences used in scientific research, drug development and diagnostics. Many suppliers sell only to qualified researchers, biotechnology companies and educational institutions, but not all of them check customers or the gene sequences they order.</p>
<p class="paywall">In 2017, Canadian researchers raised the alarm when they used $100,000 worth of mail-order DNA to recreate the extinct horsepox virus. Critics say the same methodology could be used to engineer smallpox, a closely related and deadly virus. Since then, gene synthesis has only become cheaper.</p>
<p class="paywall">Combined with advances in artificial intelligence, it is now possible to design dangerous new toxins and pathogens using large language models, although some biology training would likely still be needed to create a functional virus from scratch. Although bioterrorist attacks have been rare, they can cause mass casualties, social panic, and economic losses. A serious concern is that an AI-designed pathogen could intentionally or unintentionally cause a global pandemic.</p>
<p class="paywall">“AI tools enable the user to very quickly determine where to turn to order sequences that will not be subject to scrutiny,” says David Relman, a microbiologist and biosecurity expert at Stanford University who signed the letter. “If prompted properly, they can also tell you how to change the nature of the order so that even people checking it will have a much harder time detecting what you are trying to prepare.”</p>
<p class="paywall">Signatories include other scientists, national security experts, and executives from gene synthesis companies Twist Bioscience and Ansa Biotechnologies. These companies are members of the International Gene Synthesis Consortium, established in 2009 to implement voluntary screening practices. Many companies already use software to screen orders for &#8220;sequences of concern&#8221; that could contribute to toxicity or disease.</p>
<p class="paywall">“If you have technology that can synthesize DNA, you want to make sure it&#8217;s used responsibly, and part of that is making sure you understand what you&#8217;re doing and who you&#8217;re doing it for,” says James Diggans, vice president of policy and biosecurity at Twist Bioscience. The company has been supporting the implementation of formal rules for years.</p>
<p class="paywall">Federal <a href="https://aspr.hhs.gov/S3/Pages/OSTP-Framework-for-Nucleic-Acid-Synthesis-Screening.aspx" class="text link" target="_blank" rel="noopener">guidelines</a> introduced under the Biden administration required scientists and companies receiving federal funds to order synthetic gene sequences from suppliers that screen purchases. A <a href="https://www.cotton.senate.gov/news/press-releases/cotton-klobuchar-introduce-bill-to-establish-federal-biotech-security-framework" class="text link" target="_blank" rel="noopener">bipartisan bill</a> introduced earlier this year in the Senate would require all gene synthesis providers operating in the U.S. to screen orders and customers for bad actors or dangerous pathogens.</p>
<p class="paywall">However, screening tools are not perfect. Last year, Microsoft researchers published a <a href="https://www.science.org/doi/10.1126/science.adu8578" class="text link" target="_blank" rel="noopener">study</a> showing that AI protein design tools were able to generate potentially dangerous gene sequences that slipped through the companies&#8217; screening software. The models suggested new protein sequences with structures similar to those known to be dangerous.</p>
<p class="paywall">Geoff Ralston, former president of Y Combinator and partner at the Unthreatening AI Fund, believes that AI labs with biological models should self-screen users.</p>
<p class="paywall">“It should be very difficult, if not impossible, to ask a model to help you do something immediately dangerous,” says Ralston, who also signed the letter.</p>
<p class="paywall">Relman agrees that regulating screening practices is only part of the solution. “Given that in some cases the inspection may fail, we need to have other inspection points,” he says. “This is where AI companies will need to step up.”</p>
</div>
<p>The post <a href="https://aisckool.com/openai-and-anthropic-letter-on-the-prevention-of-ai-developed-biological-weapons/">OpenAI and Anthropic Letter on the Prevention of AI-Developed Biological Weapons</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/openai-and-anthropic-letter-on-the-prevention-of-ai-developed-biological-weapons/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i3.wp.com/media.wired.com/photos/6a1f6447234d4b89dad80277/191:100/w_1280,c_limit/science_anthropic_final.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>5 Fun Papers That Clearly Explain LLMs</title>
		<link>https://aisckool.com/5-humorous-articles-that-clearly-explain-llm/</link>
					<comments>https://aisckool.com/5-humorous-articles-that-clearly-explain-llm/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Thu, 04 Jun 2026 06:34:44 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27309</guid>

					<description><![CDATA[<p># Introduction # 1. Attention Is All You Need The Attention Is All You Need paper introduced the Transformer architecture, which is the basis of modern LLMs. Before Transformers, many language models used recurrent or convolutional architectures to process sequences. This paper showed that attention alone can be enough to build a powerful sequence model. [&#8230;]</p>
<p>The post <a href="https://aisckool.com/5-humorous-articles-that-clearly-explain-llm/">5 Fun Papers That Clearly Explain LLMs</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div id="post-">
<p> </p>
<h2><span># </span>Introduction</h2>
</p>
<h2><span># </span>1. Attention Is All You Need</h2>
<p>The <strong><a href="https://arxiv.org/abs/1706.03762" target="_blank" rel="noopener">Attention Is All You Need</a></strong> paper introduced the <strong>Transformer architecture</strong>, which is the basis of modern LLMs. Before Transformers, many language models used recurrent or convolutional architectures to process sequences. This paper showed that attention alone can be enough to build a powerful sequence model. The most important concept in the paper is self-attention. Self-attention allows each token in the sequence to look at the other tokens and decide which ones are most relevant. This is one of the reasons why LLMs can understand the context of long sentences and paragraphs. The paper also presents multi-head attention, positional encoding, and the general structure of the Transformer block. It matters because almost all modern LLMs, including the GPT, Llama, Claude, Gemini, and Qwen models, are based on the idea of the Transformer.</p>
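<p>To make self-attention more concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is a simplification for illustration: a real Transformer first maps each token through learned query, key, and value projections and uses several heads, whereas this sketch feeds the raw token vectors in directly. The function names and the toy <code>tokens</code> list are our own assumptions, not code from the paper.</p>

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of token vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score this token's query against every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors using the attention weights.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs

# Toy example: three 2-dimensional "token" vectors (made up for illustration).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
```

<p>Each row of <code>attended</code> is a weighted mix of all the value vectors, so every token ends up carrying some information from the tokens it attends to most strongly.</p>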
</p>
<h2><span># </span>2. Language Models are Few-Shot Learners</h2>
<p>This is the <strong><a href="https://arxiv.org/abs/2005.14165" target="_blank" rel="noopener">GPT-3 paper</a></strong>. It explains one of the biggest shifts in natural language processing (NLP): instead of training a separate model for each task, a single large language model can perform multiple tasks simply by reading instructions and examples in the prompt. The paper presents GPT-3, an autoregressive language model with 175 billion parameters trained to predict the next token. The most interesting part is not only the size of the model, but <strong>the idea of in-context learning</strong>. The model can see several examples in the prompt and then continue the pattern without updating its weights. This paper matters because it explains why prompting has become so powerful. It helps you understand why LLMs can answer questions, summarize text, translate, write code, and follow examples without having to be retrained for each task.</p>
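<p>In-context learning is easiest to see in how a few-shot prompt is assembled. The sketch below builds one as a plain string; the <code>Input:</code>/<code>Output:</code> layout and the helper name <code>build_few_shot_prompt</code> are illustrative assumptions rather than a format prescribed by the paper, and no model is called here.</p>

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt."""
    lines = [instruction, ""]
    for source, target in examples:
        # Each solved example shows the model the pattern to continue.
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final item is left open so the model completes it.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
print(prompt)
```

<p>The model's weights never change; the only thing that differs between tasks is the text placed in the prompt.</p>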
</p>
<h2><span># </span>3. Scaling Laws for Neural Language Models</h2>
<p>The <strong><a href="https://arxiv.org/abs/2001.08361" target="_blank" rel="noopener">Scaling Laws for Neural Language Models</a></strong> paper tried to answer a practical question: <strong>what happens when we make language models larger, train them on more data, and use more compute?</strong> It showed that model performance improves in a predictable manner as parameters, data, and compute increase. The paper covers the scaling side of modern LLMs and explains why the field has moved towards larger models and larger training runs. It matters because it provides the system-level logic behind modern LLM training. It helps explain why companies invest so heavily in larger models, larger datasets, and massive compute clusters. It also provides a useful framework for understanding more recent discussions about compute-optimal training, data quality, and efficient model scaling.</p>
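<p>The &#8220;predictable manner&#8221; in the paper takes the form of power laws. As a rough sketch, the loss as a function of parameter count can be written as <code>(n_c / n_params) ** alpha</code>. The constants below are placeholders chosen for illustration, not the fitted values from the paper, and the function name is our own.</p>

```python
def power_law_loss(n_params, n_c=1.0e14, alpha=0.08):
    """Illustrative power-law loss curve: loss = (n_c / n_params) ** alpha.

    n_c and alpha are placeholder constants for this sketch only.
    """
    return (n_c / n_params) ** alpha

# Every 10x increase in parameters multiplies the loss by the same factor.
for n_params in (1e8, 1e9, 1e10, 1e11):
    print(f"{n_params:.0e} parameters -> loss {power_law_loss(n_params):.3f}")
```

<p>The useful property is that the improvement per tenfold increase in size is constant, which is what lets teams extrapolate from small training runs to much larger ones.</p>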
</p>
<h2><span># </span>4. Training Language Models to Follow Instructions with Human Feedback</h2>
<p>This is the <strong><a href="https://arxiv.org/abs/2203.02155" target="_blank" rel="noopener">InstructGPT paper</a></strong>. It explains how a base language model becomes more useful as an assistant. A pre-trained model predicts text well, but that doesn&#8217;t automatically mean it will follow instructions, be helpful, or provide safe responses. The paper describes a training process that includes <strong>supervised fine-tuning and reinforcement learning from human feedback (RLHF)</strong>. First, people write good sample answers. Humans then rank the model&#8217;s outputs. These rankings are used to train a reward model, and the language model is further optimized to produce the answers people prefer. This paper matters because it explains the difference between a raw language model and an instruction-following assistant. If you want to understand why chat models behave differently from base models, you should definitely read it.</p>
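<p>The step where human rankings train the reward model can be sketched with a pairwise ranking loss: for each pair of answers, the loss is small when the reward model scores the human-preferred answer higher than the rejected one. This is a minimal sketch of that idea using hand-written scores; the function names and numbers are assumptions for illustration, and a real reward model would be a neural network producing these scores.</p>

```python
import math

def pairwise_ranking_loss(reward_preferred, reward_rejected):
    """Compute -log(sigmoid(reward_preferred - reward_rejected)) in a stable way."""
    margin = reward_preferred - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For negative margins, rewrite the expression to avoid overflow in exp().
    return -margin + math.log1p(math.exp(margin))

def mean_ranking_loss(pairs):
    """Average the loss over (preferred, rejected) reward pairs."""
    return sum(pairwise_ranking_loss(p, r) for p, r in pairs) / len(pairs)

# Made-up reward scores: the first pair agrees with the human ranking,
# the second pair disagrees and is penalized more heavily.
print(pairwise_ranking_loss(2.0, 0.0))
print(pairwise_ranking_loss(0.0, 2.0))
```

<p>Minimizing this loss pushes the reward model to agree with human preferences, and the language model is then optimized against that learned reward.</p>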
</p>
<h2><span># </span>5. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks</h2>
<p>The <strong><a href="https://arxiv.org/abs/2005.11401" target="_blank" rel="noopener">Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks</a></strong> paper explains retrieval-augmented generation (RAG). The main idea is that a language model does not have to rely solely on the knowledge stored in its parameters. It can retrieve relevant documents from an external source and use them to generate better answers. The paper combines a pre-trained generation model with a dense retriever and a document index. This allows the model to access external knowledge when generating responses. This is especially useful for question answering, fact-based tasks, and situations where information changes over time. The paper matters because many real-world LLM applications use some form of retrieval. Chatbots, enterprise assistants, search systems, customer service agents, and documentation tools often use RAG to ground responses in specific sources.</p>
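<p>The retrieve-then-generate flow can be sketched in a few lines. Note the substitution: the paper uses a dense neural retriever, while this toy version ranks documents by word-overlap cosine similarity so it runs with the standard library alone. The helper names, the sample <code>docs</code>, and the prompt layout are all assumptions for illustration, and the final generation step is left to whatever model receives the prompt.</p>

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:top_k]

def build_rag_prompt(question, documents, top_k=1):
    """Put the retrieved text in front of the question for the generator."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# A tiny made-up document index.
docs = [
    "The Transformer architecture relies on self-attention.",
    "Evaporative cooling uses water to remove heat from servers.",
    "RLHF trains a reward model from human rankings.",
]
prompt = build_rag_prompt("How does evaporative cooling remove heat?", docs)
print(prompt)
```

<p>Because the answer is generated from the retrieved context, updating the document index changes what the system knows without retraining the model.</p>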
</p>
<h2><span># </span>Summary</h2>
<p>Together, these five papers give a good overview of how modern LLMs work:</p>
<p><span><strong>Transformer architecture → pre-training → scaling → instruction tuning → retrieval-augmented generation</strong></span> </p>
<p>Don&#8217;t worry if you don&#8217;t understand every equation or technical detail on your first reading. The goal is simply to understand the main idea behind each paper and why it matters. Once you do this, most LLM concepts will start to make a lot more sense.</p>
<p><b><a href="https://www.linkedin.com/in/kanwal-mehreen1/" target="_blank" rel="noopener noreferrer">Kanwal Mehreen</a></b> is a machine learning engineer and technical writer with a deep passion for data science and the intersection of artificial intelligence and medicine. She is co-author of the e-book &#8220;Maximizing Productivity with ChatGPT&#8221;. As a 2022 Google Generation Scholar for APAC, she promotes diversity and academic excellence. She is also recognized as a Teradata Diversity in Tech Scholar, a Mitacs Globalink Research Scholar, and a Harvard WeCode Scholar. Kanwal is a staunch advocate for change and founded FEMCodes to empower women in STEM fields.</p>
</p></div>
<p>The post <a href="https://aisckool.com/5-humorous-articles-that-clearly-explain-llm/">5 Fun Papers That Clearly Explain LLMs</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/5-humorous-articles-that-clearly-explain-llm/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i2.wp.com/www.kdnuggets.com/wp-content/uploads/kdn-5-fun-papers-that-explain-llms-clearly.png?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Data center operators are trying to solve water consumption problems</title>
		<link>https://aisckool.com/data-center-operators-are-trying-to-solve-water-consumption-problems/</link>
					<comments>https://aisckool.com/data-center-operators-are-trying-to-solve-water-consumption-problems/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Wed, 03 Jun 2026 21:32:49 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=27297</guid>

					<description><![CDATA[<p>SpaceX on Monday amended its initial public offering to state that water conditions – including water scarcity, water regulations and drought – could limit data center development. It&#8217;s not the only tech company trying to assess how water scarcity might impact its business. Water consumption is becoming one of the most controversial issues in data [&#8230;]</p>
<p>The post <a href="https://aisckool.com/data-center-operators-are-trying-to-solve-water-consumption-problems/">Data center operators are trying to solve water consumption problems</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p></p>
<div>
<p><span class="lead-in-text-callout">SpaceX on Monday</span> amended its initial public offering filing to state that water conditions, including water scarcity, water regulations, and drought, could limit data center development.</p>
<p class="paywall">It&#8217;s not the only tech company trying to assess how water scarcity might impact its business. Water consumption is becoming one of the most controversial issues around data centers. A recent Gallup <a data-offer-url="https://news.gallup.com/poll/709772/americans-oppose-data-centers-area.aspx" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://news.gallup.com/poll/709772/americans-oppose-data-centers-area.aspx&quot;}" href="https://news.gallup.com/poll/709772/americans-oppose-data-centers-area.aspx" rel="nofollow noopener" target="_blank">poll</a> found that seven in ten Americans oppose data center development, with water scarcity ranked as the top resource concern. In the face of increasingly fierce resistance, some tech companies are trying to assure the public that they are confronting the problem.</p>
<p class="paywall">Data centers mainly use water to cool server racks, which produce enormous amounts of heat. One popular technique, known as evaporative cooling, uses fresh water to absorb heat; the water is then pumped to cooling towers, where it evaporates to the outside.</p>
<p class="paywall">Using more water can save money and reduce emissions for big technology companies by reducing the power needed for cooling, which otherwise relies on energy-intensive pumps to recirculate water. But it can also come with a huge water footprint: for example, Google&#8217;s facility in Council Bluffs, Iowa, which uses evaporative cooling, <a data-offer-url="https://www.gstatic.com/gumdrop/sustainability/google-2025-environmental-report.pdf" class="external-link text link" data-event-click="{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https://www.gstatic.com/gumdrop/sustainability/google-2025-environmental-report.pdf&quot;}" href="https://www.gstatic.com/gumdrop/sustainability/google-2025-environmental-report.pdf" rel="nofollow noopener" target="_blank">consumed</a> over 1 billion gallons in 2024.</p>
<p class="paywall">Lawrence Berkeley National Laboratory <a href="https://escholarship.org/uc/item/32d6m0d1" class="text link" target="_blank" rel="noopener">predicted</a> in a 2024 report that hyperscale data centers could use up to 33 billion gallons of water by 2030 if they relied heavily on evaporative cooling. That&#8217;s comparable to, or even less than, other thirsty industries like agriculture and oil and gas (a <a href="https://www.usgs.gov/faqs/how-much-water-does-typical-hydraulically-fractured-well-require" class="text link" target="_blank" rel="noopener">fracked well</a> can use 1.5 to 16 million gallons of water), but it poses a risk in regions where water is already limited. The risk is particularly acute in summer, when data center cooling demand surges at the same time as municipal water use.</p>
<p class="paywall">“Water is a highly local and regional issue,” says Shaolei Ren, an engineering professor at the University of California, Riverside. “It&#8217;s a limited resource and we have to manage it very carefully.”</p>
<p class="paywall">Some tech giants, including Microsoft, OpenAI, and Oracle, have released statements in recent months indicating they are moving away from evaporative cooling entirely to save water. This includes OpenAI and Oracle&#8217;s massive Stargate expansion in multiple states, including water-stressed regions of Texas.</p>
<p class="paywall">These include commitments to replenish more freshwater than the company uses by investing in local water projects; increasing the use of recovered and recycled water; and disclosing annual water consumption in data centers. (Other tech companies, including Microsoft, are making similar promises about water replenishment and local investment. Google has been working on most of these promises for several years.) It also promises to use a &#8220;data-driven framework&#8221; to decide which data center designs will work best in local watersheds.</p>
<p class="paywall">Ben Townsend, global head of infrastructure and sustainability at Google, says data center design is much more complicated than simply abandoning one type of cooling across the board. He said the company has been conducting detailed hydrological assessments of its locations over the past four years to determine what type of cooling would be best.</p>
<p class="paywall">“In some regions there is little water, in others there is plenty of it,” he says. “A one-size-fits-all strategy just doesn&#8217;t work.”</p>
<p class="paywall">In April, Google <a href="https://ec.europa.eu/info/law/better-regulation/have-your-say/initiatives/16035-Energy-efficiency-rating-scheme-for-data-centres-in-Europe/F33395555_en" class="text link" target="_blank" rel="noopener">defended</a> evaporative cooling in so-called water-&#8220;abundant&#8221; areas in a submission to the European Union, calling it necessary for the development of truly sustainable data centers. Google&#8217;s arguments dovetail with recent research by Ren and his team, who found that if all U.S. data centers used some type of evaporative cooling during peak demand, it could free up an additional 10 to 30 gigawatts of power. In areas where power grids are stressed but water supplies are not, the use of evaporative cooling can provide significant headroom for utilities trying to balance the load.</p>
</div>
<p>The post <a href="https://aisckool.com/data-center-operators-are-trying-to-solve-water-consumption-problems/">Data center operators are trying to solve water consumption problems</a> appeared first on <a href="https://aisckool.com">AI SCKOOL</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/data-center-operators-are-trying-to-solve-water-consumption-problems/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i3.wp.com/media.wired.com/photos/6a1f238eaaa9871d34a367b3/191:100/w_1280,c_limit/Data-Center-Water-Consumption-Science-2245861123.jpg?ssl=1" medium="image"></media:content>
            	</item>
	</channel>
</rss>
