Wednesday, March 11, 2026

Here’s how I built an MCP server to automate my data science work



Image generated with Ideogram

Most of my days as a data scientist look like this:

  • Stakeholder: “Can you tell us how much we earned in advertising revenue last month, and what percentage of it came from search ads?”
  • Me: “I run a SQL query to extract the data and send it over.”
  • Stakeholder: “Got it. What’s our revenue forecast for the next 3 years?”
  • Me: “I consolidate data from multiple sources, talk to the finance team, and build a model that forecasts revenue.”

Tasks like these are ad hoc requests from business stakeholders. They take about 3-5 hours each and are usually unrelated to the core project I’m working on.

When such data questions come up, they often force me to push back deadlines on current projects or put in extra hours to get the task done. And that’s where AI comes in.

Once AI models such as ChatGPT and Claude became available, my team’s productivity improved, as did my ability to respond to ad hoc stakeholder requests. AI dramatically shortened the time spent writing code, generating SQL queries, and even collaborating with other teams to obtain the required information. And once AI coding assistants such as Cursor were integrated with our codebases, productivity improved even further. Tasks like the one I described above can now be done in half the time.

Recently, when MCP servers began to gain popularity, I thought:

Could I build an MCP server that automates these data science workflows?

I spent two days building this MCP server, and this article breaks down:

  • The results, and how much time I saved with my data science MCP
  • The reference resources and materials I used to create the MCP
  • The basic configuration, APIs, and services I integrated into the workflow

# Building a data science MCP

If you don’t know what MCP is yet: it stands for Model Context Protocol, a framework that lets you connect a large language model to external services.
This video is a great introduction to MCPs.

// The core problem

The problem I wanted to solve with my new data science MCP was:

How do I consolidate information scattered across various sources and generate results that stakeholders and team members can use directly?

To achieve this, I built the MCP with three components, as shown in the block diagram below:

Data science MCP diagram (image by the author)

// Component 1: Query bank integration

As the knowledge base for my MCP, I used my team’s query bank (which contains past questions, an example SQL query that answers each question, and some context about the relevant tables).

When a stakeholder asks me a question like:

What percentage of advertising revenue comes from search ads?

I no longer have to dig through dozens of tables and column names to write a query. Instead, the MCP searches the query bank for a similar question, pulls in context about the relevant tables, and adapts the stored query to my specific question. All I have to do is call the MCP server, paste in my stakeholder’s request, and within minutes I have the right query.
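As a rough sketch of how such a lookup can work, here is a minimal word-overlap matcher over a toy query bank. The field names (question, sql, context) and sample entries are illustrative assumptions, not my actual bank, and the real matching is more involved than simple word overlap:

```python
def search_query_bank(question: str, bank: list[dict]) -> dict:
    """Return the bank entry whose stored question best overlaps the new one."""
    q_words = set(question.lower().split())

    def overlap(entry: dict) -> int:
        # Count how many words the stored question shares with the new one.
        return len(q_words & set(entry["question"].lower().split()))

    return max(bank, key=overlap)

# Toy bank with the kinds of fields described above: a question,
# an example SQL query, and some context about the tables.
bank = [
    {"question": "What was total ad revenue last month?",
     "sql": "SELECT SUM(revenue) FROM ads WHERE month = '2026-02'",
     "context": "ads table: daily revenue by ad type"},
    {"question": "What share of ad revenue comes from search ads?",
     "sql": "SELECT SUM(revenue) FROM ads WHERE ad_type = 'search'",
     "context": "ad_type column values: 'search', 'display', 'video'"},
]

best = search_query_bank(
    "What percentage of advertising revenue comes from search ads?", bank
)
```

The matched entry then supplies the table context and example SQL that the LLM adapts to the specific question.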

// Component 2: Google Drive integration

Product documentation is usually stored on Google Drive – whether as a slide deck, document, or spreadsheet.

I connected my MCP server to the team’s Google Drive so that it has access to all our documentation across dozens of projects. This helps me quickly pull data and answer questions like:

Can you tell us how much we earned in advertising revenue last month?

I also indexed these documents by extracting specific keywords and titles, so the MCP only has to scan a keyword list for each query rather than reading hundreds of pages at once.

For example, if someone asks a question about “mobile video ads”, the MCP first searches the document index to identify the most relevant files before reviewing them.
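A minimal sketch of such a keyword index is below. The document names and keyword lists are made up for illustration; in practice the index is built from the team’s Drive files:

```python
from collections import defaultdict

def build_index(docs: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each keyword to the set of documents tagged with it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, keywords in docs.items():
        for kw in keywords:
            index[kw.lower()].add(name)
    return index

def find_docs(index: dict[str, set[str]], query: str) -> set[str]:
    """Return documents whose keywords appear in the query."""
    hits: set[str] = set()
    for word in query.lower().split():
        hits |= index.get(word, set())
    return hits

# Hypothetical index entries standing in for real Drive documents.
index = build_index({
    "q3_video_campaign.doc": ["Q3", "video", "ads", "mobile"],
    "revenue_forecast.sheet": ["revenue", "forecast"],
})

docs = find_docs(index, "mobile video ads performance")
```

Only the few matching documents are then opened and read in full, instead of every file in the Drive.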

// Component 3: Access to local documents

This is the simplest component of the MCP: a local folder that the MCP searches. I add or remove files as needed, which lets me supply my own context, information, and instructions alongside my team’s project docs.
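A rough sketch of what this local-folder search could look like, using a throwaway temp directory in place of my actual context folder (file names and contents below are invented for the demo):

```python
from pathlib import Path
import tempfile

def search_local_docs(folder: Path, query: str) -> list[str]:
    """Return names of text files in `folder` containing any word of the query."""
    words = {w.lower() for w in query.split()}
    hits = []
    for path in folder.glob("*.txt"):
        text = path.read_text(encoding="utf-8").lower()
        if any(w in text for w in words):
            hits.append(path.name)
    return sorted(hits)

# Demo: a throwaway folder standing in for the real local context folder.
demo = Path(tempfile.mkdtemp())
(demo / "notes.txt").write_text("Q3 video ad campaign context")
(demo / "misc.txt").write_text("unrelated meeting summary")

matches = search_local_docs(demo, "video ads")
```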

# Putting it together: how my data science MCP works

Here is an example of how my MCP currently handles an ad hoc data request:

  • A question comes in: “How many ad impressions did we serve in Q3, and what was ad demand relative to supply?”
  • The MCP’s document-retrieval tool searches our project folder for “Q3”, “video”, “ads”, “demand”, and “supply” and finds the relevant project documents
  • It then pulls specific details about the Q3 ad campaign and its supply and demand from the team documents
  • It searches the query bank for similar questions about ads
  • It uses the context from the documents and the query bank to generate a SQL query for the Q3 video campaign
  • Finally, it runs the query
  • I then collect the results, review them, and send them to my stakeholders

# Implementation details

Here’s how I implemented this MCP:

// Step 1: Installing Cursor

I used Cursor as my MCP client. You can install Cursor with this link. It’s essentially an AI code editor that can access your codebase and use it to generate or modify code.

// Step 2: Google credentials

Almost all the documents this MCP uses (including the query bank) are stored on Google Drive.

To give the MCP access to Google Drive, Sheets, and Docs, you need to configure API access:

  1. Go to the Google Cloud Console and create a new project.
  2. Enable the following APIs: Google Drive, Google Sheets, Google Docs.
  3. Create credentials (an OAuth 2.0 client ID) and save them in a file called credentials.json.

// Step 3: Setting up FastMCP

FastMCP is an open-source Python framework for building MCP servers. I followed this tutorial to build my first MCP server using FastMCP.

(Note: the tutorial uses Claude Desktop as the MCP client, but the steps apply to Cursor or any AI code editor of your choice.)

With FastMCP, you can create an MCP server with Google integration (sample code snippet below):

from fastmcp import FastMCP

mcp = FastMCP("team-data-assistant")

@mcp.tool()
def search_team_docs(query: str) -> str:
    """Search team documents in Google Drive"""
    drive_service, _ = get_google_services()
    # Your search logic here
    return f"Searching for: {query}"

// Step 4: Configuring the MCP in Cursor

After building the MCP, you can configure it in Cursor. Go to Cursor Settings → Features → Model Context Protocol. There you’ll see a section where you can add an MCP server. After clicking it, a file called mcp.json opens, where you can add the configuration for your new MCP server.

This is an example of what your configuration should look like:

{
  "mcpServers": {
    "team-data-assistant": {
      "command": "python",
      "args": ["path/to/team_data_server.py"],
      "env": {
        "GOOGLE_APPLICATION_CREDENTIALS": "path/to/credentials.json"
      }
    }
  }
}

After saving the changes to the JSON file, you can enable the MCP and start using it in Cursor.

# Final thoughts

This MCP server was a simple side project I built to save time in my personal data science workflows. It’s not a breakthrough, but it solves my immediate pain point: spending hours responding to ad hoc data requests that take time away from the core projects I’m working on. I believe a tool like this only scratches the surface of what’s possible with generative AI, and it represents a broader shift in how data science work gets done.

The classic data science workflow is fading away:

  • Spending hours finding data
  • Writing code
  • Building models

The emphasis is shifting away from hands-on technical work, and data scientists are increasingly expected to see the bigger picture and solve business problems. In some cases, we’re expected to weigh in on product decisions and step into the role of a product or project manager.

As AI evolves, I think the boundaries between technical roles will blur. Understanding the business context, asking the right questions, interpreting results, and communicating insights will remain critical. If you’re a data scientist (or an aspiring one), there’s no doubt that AI will change the way you work.

You have two options: embrace AI tools and build solutions that shape this change for your team, or let others build them for you.

Natassha Selvaraj is a self-taught data scientist with a passion for writing. Natassha writes about everything data-related and is a true master of all data topics. You can connect with her on LinkedIn or check out her YouTube channel.
