
Image by Author | Canva
A strong portfolio is often the difference between making it and breaking it. But what exactly makes a portfolio strong? Numerous complicated projects? A clever design? Impressive data visualizations? Yes and no. Although these are necessary elements of a great portfolio, they are so obvious that everyone knows you can’t do without them.
However, many data scientists make mistakes when trying to go beyond that. As a result, they go to interviews with a portfolio that nominally has everything but isn’t really that great.
# The Framework
Here is a framework that will help you avoid typical mistakes when building a great portfolio.

# Mistakes
Let’s now talk about the mistakes people make when building a portfolio and how to avoid them with this framework.
// Mistake 1: Building projects you don’t care about
Many portfolios give the impression that the projects are there to tick a box: Titanic survival, the Iris dataset, MNIST digits. You know, the usual. Not only will you drown among thousands of similar portfolios, but it also shows a lack of originality and interest in what you do. These are autopilot projects.
Fix: Start with domains that interest you, e.g. sports, finance, or music. When the topic interests you, you’ll go deeper without even trying. If you are a sports fan, you can analyze NBA shot performance or pick another project idea for practice. A music fan can model playlist recommendations.
// Mistake 2: Using whatever data you get your hands on
Candidates often grab the first clean CSV they can find. The problem is that real data science does not work this way.
Fix: Show that you know how to find real data, access it, and transform it for the later modeling stages. In your projects, use APIs (e.g. the Twitter/X API), open government datasets (e.g. Data.gov), and sources collected from the web (e.g. Awesome Public Datasets on GitHub). Use multiple data sources, evaluate the data, merge it into one dataset, and prepare it for modeling.
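As a rough illustration of the "merge into one dataset" step, here is a minimal pandas sketch. The column names (`timestamp`, `demand_mwh`, `temp_c`), the helper `combine_sources`, and the toy values are all assumptions made up for this example; real sources would need their own parsing and key alignment first.

```python
import pandas as pd


def combine_sources(primary: pd.DataFrame, extra: pd.DataFrame) -> pd.DataFrame:
    """Left-join a second source onto the primary one and report unmatched rows.

    Hypothetical helper: assumes both frames share a unique 'timestamp' key.
    """
    merged = primary.merge(
        extra,
        on="timestamp",
        how="left",             # keep every row of the primary source
        indicator=True,         # adds a '_merge' column describing each row's origin
        validate="one_to_one",  # raise if the key is duplicated on either side
    )
    unmatched = int((merged["_merge"] == "left_only").sum())
    print(f"{unmatched} of {len(merged)} rows have no match in the second source")
    return merged.drop(columns="_merge")


# Toy frames standing in for data pulled from two different sources.
demand = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"]
    ),
    "demand_mwh": [410.0, 395.5, 388.2],
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 02:00"]),
    "temp_c": [3.1, 2.4],
})

combined = combine_sources(demand, weather)
```

Reporting the unmatched rows, instead of silently dropping them, is one way to show the "evaluate the data" part of the workflow in a portfolio project.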
// Mistake 3: Treating projects like Kaggle competitions
Kaggle competitions focus on optimizing a single metric. That is great for practice, but it doesn’t cut it in the real world. Accuracy alone is not the goal. You will have to make trade-offs between the technical aspects of your model and its actual business or social impact.
Fix: Even if you use common Kaggle datasets, always offer a different angle on the problem so that it has business or social value. For example, don’t just classify fake vs. real news. Show which words, phrases, or topics drive misinformation. Another example: don’t just predict churn. Show how a 10% reduction in churn can save $2 million in annual revenue.

// Mistake 4: Showing only models, not workflows
Many projects read like a Jupyter notebook sequence: import libraries, preprocess the data, fit the models, and here is the accuracy. That is incomplete and monotonous. What is missing is how you handle the different stages of the project and why you make certain decisions.
Fix: Build end-to-end projects. Show each stage, from data collection to deployment and everything in between. Explain why you made key choices, e.g. why you chose one model over another or why you engineered a specific feature. Use tools such as Streamlit, Flask, or Power BI to deploy dashboards for others to use. All of this will make your projects look like applied problem solving (e.g. Arch Desai’s portfolio), not a code dump.
// Mistake 5: Ending with a model, not an action
Data scientists often stop at the technical level, e.g. reporting an accuracy score. OK, but what do you do with it? Remember that the practical application of the model is crucial. The technical side of the model is only one part; the other is its business or social impact.
Fix: End the project with a recommendation of what to do. For example, “this model suggests prioritizing inspections at restaurants serving high-risk cuisine in winter.”
# Example project: Forecasting city energy demand to reduce costs
In this section, I will walk through a sample project to show how the framework can be used in practice.
Domain: The domain I have chosen is energy consumption and sustainability. Living in a large city made me realize how cities around the world struggle with high electricity demand during peak hours. Forecasting demand more accurately can help balance the grid, cut costs, and reduce emissions.
Data: The main source can be the U.S. Energy Information Administration (EIA). In addition, I could use the NOAA weather API (e.g. for temperature and humidity) and holiday/event calendars (for demand spikes).
Framing the problem: Instead of framing the problem as “predict electricity demand over time”, I will frame it as “how much money could the city save if it shifted peak loads using better demand forecasts?” This turns a technical forecasting problem into a resource allocation and cost savings problem.
Building end-to-end: The project would include these stages.
- Data cleaning: handle missing hours, align timestamps, normalize weather variables.
- Feature engineering:
  - Lag features: demand in previous hours/days
  - Weather features: temperature, humidity
  - Calendar features: weekday, holiday flag, major events
- Modeling:
- Deployment: For example, I could create a dashboard showing the 24-hour forecast against actual demand and simulating “what if” scenarios, e.g. adjusting demand by shifting industrial loads.
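The cleaning and feature engineering stages listed above could be sketched roughly as follows with pandas. This is a minimal sketch under stated assumptions, not the project's actual code: the column names (`demand_mwh`, `temp_c`, `humidity`), the helper `build_features`, the three-hour interpolation limit, and the synthetic data are all hypothetical.

```python
import numpy as np
import pandas as pd


def build_features(df: pd.DataFrame, holidays: list) -> pd.DataFrame:
    """Clean an hourly demand/weather frame and add lag and calendar features.

    Assumes a DatetimeIndex and the columns 'demand_mwh', 'temp_c', 'humidity'.
    """
    # Data cleaning: put the series on a regular hourly grid, then fill short gaps.
    out = df.sort_index().asfreq("h")
    out = out.interpolate(method="time", limit=3)

    # Normalize weather variables (z-scores). In a real project the mean and
    # standard deviation should come from the training split only.
    for col in ["temp_c", "humidity"]:
        out[f"{col}_z"] = (out[col] - out[col].mean()) / out[col].std()

    # Lag features: demand one hour and one day earlier.
    out["lag_1h"] = out["demand_mwh"].shift(1)
    out["lag_24h"] = out["demand_mwh"].shift(24)

    # Calendar features: weekday (Monday = 0) and a holiday flag.
    out["weekday"] = out.index.dayofweek
    out["is_holiday"] = out.index.normalize().isin(pd.to_datetime(holidays))
    return out


# Synthetic two-day example with one missing hour (all values are made up).
idx = pd.date_range("2024-01-01", periods=48, freq="h")
rng = np.random.default_rng(0)
raw = pd.DataFrame(
    {
        "demand_mwh": 400 + 50 * np.sin(np.arange(48) * 2 * np.pi / 24)
        + rng.normal(0, 5, 48),
        "temp_c": rng.normal(3, 2, 48),
        "humidity": rng.uniform(40, 90, 48),
    },
    index=idx,
).drop(idx[10])

features = build_features(raw, holidays=["2024-01-01"])
```

The first rows of the lag columns are empty by construction, so a modeling step would typically drop them or start training after the longest lag.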
Action: We won’t stop at “the forecast has a low RMSE”. Instead, let’s give a recommendation that has business and social impact, e.g. “if the city incentivized large businesses to shift 5% of their consumption away from peak hours (as predicted by the model), it could save $3.5 million annually in grid costs.”
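To make a recommendation like this traceable, the "what if" arithmetic can be shown explicitly. The sketch below is only an illustration: the function `shift_savings`, the peak window, the loads, and the prices are hypothetical numbers, not EIA data, and they are not meant to reproduce the $3.5 million figure. It also assumes the shifted energy is simply billed at the off-peak price instead of the peak price.

```python
import numpy as np


def shift_savings(forecast_mwh, peak_mask, peak_price, offpeak_price, shift_frac=0.05):
    """Estimate savings from moving a fraction of peak-hour load to off-peak hours.

    Hypothetical simplification: savings = shifted energy * price difference.
    """
    forecast_mwh = np.asarray(forecast_mwh, dtype=float)
    peak_mask = np.asarray(peak_mask, dtype=bool)
    shifted_mwh = forecast_mwh[peak_mask].sum() * shift_frac
    return shifted_mwh * (peak_price - offpeak_price)


# Toy 24-hour forecast: 300 MWh per hour, rising to 500 MWh from 17:00 to 20:59.
hours = np.arange(24)
peak_mask = (hours >= 17) & (hours <= 20)
forecast = np.where(peak_mask, 500.0, 300.0)

# Made-up prices in dollars per MWh.
daily_savings = shift_savings(forecast, peak_mask, peak_price=120.0, offpeak_price=60.0)
print(f"Estimated daily savings: ${daily_savings:,.0f}")
```

With these toy inputs, 5% of the 2,000 MWh peak load is 100 MWh, and at a $60 price difference that comes to about $6,000 per day. A real estimate would need actual tariffs and a check that the off-peak hours can absorb the shifted load.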
# Bonus: Resources
As a bonus, here are some suggestions on which platforms you can use for practice and where to find data.
// Practice platforms
// Open data sources
// APIs for real-time data
# Conclusion
You probably noticed that none of these mistakes is technical. This is not accidental; the biggest mistake is forgetting that a portfolio is a demonstration of how you solve problems.
Focus on these two aspects, demonstration and problem solving, and your portfolio will finally look like proof that you can do the job.
Nate Rosidi is a data scientist and works in product strategy. He also teaches analytics and is the founder of StrataScratch, a platform that helps data scientists prepare for interviews with real interview questions from top companies. Nate writes about the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.
