<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Analysis &#8211; Howard Nguyen</title>
	<atom:link href="https://howardnguyen.com/category/data-analysis/feed/" rel="self" type="application/rss+xml" />
	<link>https://howardnguyen.com</link>
	<description>Ph.D. in Data Science</description>
	<lastBuildDate>Sun, 04 Aug 2024 16:00:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://howardnguyen.com/wp-content/uploads/2023/05/H-icon3-36x36.png</url>
	<title>Data Analysis &#8211; Howard Nguyen</title>
	<link>https://howardnguyen.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Google Colab vs. Jupyter vs. Visual Studio Code</title>
		<link>https://howardnguyen.com/google-colab-vs-jupyter-vs-visual-studio-code/</link>
					<comments>https://howardnguyen.com/google-colab-vs-jupyter-vs-visual-studio-code/#comments</comments>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Sun, 04 Aug 2024 16:00:12 +0000</pubDate>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[AutoRegression in Time Series]]></category>
		<category><![CDATA[Exploratory Data Analysis (EDA)]]></category>
		<guid isPermaLink="false">https://howardnguyen.com/?p=1248</guid>

					<description><![CDATA[The choice between Google Colab, Jupyter Notebook, and Visual Studio Code (VS Code) for running Python code depends on your specific needs and preferences.]]></description>
										<content:encoded><![CDATA[<p>Which one should you choose? The choice between Google Colab, Jupyter Notebook, and Visual Studio Code (VS Code) for running Python code depends on your specific needs and preferences. Here’s a detailed comparison of the advantages and disadvantages of each to help you decide which one might be best for you. Each tool has its…</p>
<p><a href="https://howardnguyen.com/google-colab-vs-jupyter-vs-visual-studio-code/" rel="nofollow">Source</a></p>]]></content:encoded>
					
					<wfw:commentRss>https://howardnguyen.com/google-colab-vs-jupyter-vs-visual-studio-code/feed/</wfw:commentRss>
			<slash:comments>52</slash:comments>
		
		
			</item>
		<item>
		<title>What is regularization and why is it important?</title>
		<link>https://howardnguyen.com/what-is-regularization-and-why-it-is-important/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Sun, 30 Jun 2024 17:50:21 +0000</pubDate>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[Exploratory Data Analysis]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Regularization]]></category>
		<category><![CDATA[Machine Learning Model]]></category>
		<guid isPermaLink="false">https://howardnguyen.com/?p=1134</guid>

					<description><![CDATA[]]></description>
										<content:encoded><![CDATA[<p>Why is it important? Regularization is a technique used in machine learning and statistics to prevent overfitting, which occurs when a model learns the noise in the training data instead of the actual underlying patterns. Regularization adds a penalty to the model’s complexity, discouraging it from fitting too closely to the training data. This helps improve the model’s generalization to new…</p>
<p><a href="https://howardnguyen.com/what-is-regularization-and-why-it-is-important/" rel="nofollow">Source</a></p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How do you handle missing data?</title>
		<link>https://howardnguyen.com/how-do-you-handle-missing-data/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Sun, 30 Jun 2024 17:18:53 +0000</pubDate>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[Exploratory Data Analysis]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Imputation Methods]]></category>
		<guid isPermaLink="false">https://howardnguyen.com/?p=1124</guid>

					<description><![CDATA[]]></description>
										<content:encoded><![CDATA[<p>Handling missing data is a crucial step in data preprocessing, as it can significantly affect the performance of machine learning models. Here are some common techniques to handle missing data: Using pandas and scikit-learn: import pandas as pd from sklearn.impute import SimpleImputer from sklearn.impute import KNNImputer # Sample data data…</p>
<p><a href="https://howardnguyen.com/how-do-you-handle-missing-data/" rel="nofollow">Source</a></p>]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
