AI agents have become increasingly popular recently, and demand for people who can build an independent agent that browses the web and takes actions is higher than the supply. To build one, you need a working knowledge of web scraping, LLMs, and machine learning.
These are the topics you need in order to build your own agent. Here is what each one is used for.
- Web Scraping: Scraping is a highly in-demand skill for software engineers. With web extraction you can get the data you need through an API where one exists, or without one by fetching the page and parsing its HTML directly. That lets you collect data even from sites that offer no API.
- LLM (Large Language Model): LLMs became widely known after ChatGPT launched in late 2022, but the underlying ideas are older. An LLM is a neural network trained on a very large amount of text to predict the next token, which is what lets it generate fluent language. The "GPT" in ChatGPT stands for Generative Pre-trained Transformer, the family of models it is built on.
- Machine Learning: A field of artificial intelligence that uses statistical and mathematical methods to train a model, that is, a program that learns from data (its experience) and predicts values for new inputs.
Web Scraping Using Python
These are not copied-and-pasted tricks. What I am sharing comes from my own experience, learned through trial and error since 2019. When I got interested in programming in 2019, I started with Python, often called the simplest programming language to learn. I learned Python, then full-stack web development, and when ChatGPT arrived in late 2022 it took my learning to another level.
I started learning about LLMs, neural networks, and machine learning, and that journey has not ended since. So read the whole article: I guarantee you will pick up the basics of how agents work and how they are built, and I hope you will build your own agent that can scrape the web and answer the questions you ask it.
Now that you know what you need to build your own agent, let's decide which programming language to use. I personally recommend Python because of its wide range of open-source libraries, whose prebuilt functionality takes care of more than half the work. For web page extraction you can choose from libraries such as Selenium, BeautifulSoup, Scrapy, Playwright, Requests, and lxml.
Simple scraping → BeautifulSoup + Requests
Large-scale projects → Scrapy
JavaScript-heavy pages → Selenium or Playwright
Fast & efficient parsing → lxml
These are the scraping tools you can use from Python; visit their sites for more details. Now that scraping is covered, let's move on to LLMs: what an LLM is and how to use one in an agent project.
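To make the "simple scraping" option concrete, here is a minimal sketch using Requests and BeautifulSoup. The function names `fetch_html` and `extract_title_and_links` and the sample HTML are illustrative assumptions, not part of either library. Parsing is kept separate from downloading so it can be tried on any HTML string.

```python
import requests  # pip install requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_title_and_links(html: str) -> tuple[str, list[str]]:
    """Return the page title and every href found in <a> tags."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return title, links


# Parsing works on any HTML string, so no network call is needed to try it.
sample_html = """
<html><head><title>Demo page</title></head>
<body><a href="/about">About</a> <a href="https://example.com">Example</a></body></html>
"""
title, links = extract_title_and_links(sample_html)
print(title)
print(links)
```

For a live page you would call `extract_title_and_links(fetch_html(url))` with a URL you are allowed to scrape.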
LLM (Large Language Model)
LLM stands for Large Language Model. An LLM is a neural network trained on a very large amount of text to predict the next word (token) given the words before it. It belongs to NLP (natural language processing), and rather than following hand-written grammar rules it learns patterns such as tenses and verb forms from its training data. LLMs became famous after ChatGPT was released in 2022.
ChatGPT became the most famous GPT-based product in the world, in large part because of the huge amount of data its underlying model was trained on. ChatGPT uses an LLM to generate text that follows the grammar of whichever language it is writing in.
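One way to picture what "generating language" means: a language model predicts the next word from the words before it. The toy below does this with simple word-pair counts. A real LLM uses a large neural network instead of a count table, so treat this only as an illustration of the idea; the corpus and function names are made up for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional


def train_bigram_model(text: str) -> Dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    followers: Dict[str, Counter] = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next(model: Dict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


corpus = "the agent reads the page and the agent answers the question"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "agent" follows "the" most often here
```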
There are many LLMs, and some of them are open source. That matters because collecting and storing a large enough amount of clean data to train your own model is very difficult. So if you are new to LLMs, you can start building your agent with an open-source model. To connect the model to your code and tools, you can try LangChain, a popular open-source framework for building LLM applications (it is a framework, not a model itself).
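To show where the LLM sits inside an agent, here is a minimal sketch of the "scraped text plus question goes in, answer comes out" step. It does not use any real LLM or LangChain API: `stub_llm` is a hypothetical stand-in, and in practice you would pass in a function that calls whichever model you choose.

```python
from typing import Callable


def build_prompt(question: str, page_text: str, max_chars: int = 2000) -> str:
    """Combine scraped page text and the user's question into one prompt."""
    context = page_text[:max_chars]  # truncate so the prompt stays short
    return (
        "Answer the question using only the page content below.\n\n"
        f"Page content:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer_from_page(question: str, page_text: str, llm: Callable[[str], str]) -> str:
    """Send the prompt to any callable that maps a prompt to a reply."""
    return llm(build_prompt(question, page_text)).strip()


def stub_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real model call; returns a canned reply."""
    return f"[stub reply to a {len(prompt)}-character prompt]"


page_text = "Python is a programming language with many open-source libraries."
reply = answer_from_page("What is Python?", page_text, stub_llm)
print(reply)
```

Keeping the model behind a plain callable means the scraping and prompt-building code does not change when you swap one model for another.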
Now that you know about LLMs too, let's move on to machine learning, a statistical and mathematical way of training a machine to replicate human intelligence. A good place to start is with linear regression and logistic regression. Linear regression is a regression model: it predicts a continuous numeric value from the inputs rather than choosing from a fixed set of labels. Logistic regression is a classification model: it assigns each input to one of a set of predefined labels.
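The difference between the two models is easiest to see side by side. This is a small sketch with scikit-learn on made-up numbers (the arrays below are synthetic, chosen only for illustration): the regressor returns a number, the classifier returns one of the labels it was trained on.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# One input feature, six samples (synthetic values for illustration).
x = np.array([[1], [2], [3], [4], [5], [6]])

# Regression target: a continuous value (here exactly 2 * x).
y_continuous = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
# Classification target: a predefined label, 0 or 1.
y_label = np.array([0, 0, 0, 1, 1, 1])

regressor = LinearRegression().fit(x, y_continuous)
classifier = LogisticRegression().fit(x, y_label)

value = regressor.predict(np.array([[7]]))[0]   # a number on a continuous scale
labels = classifier.predict(np.array([[0], [7]]))  # each one is 0 or 1

print(f"Linear regression output: {value:.2f}")
print(f"Logistic regression output: {labels}")
```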
Linear Regression Model for Training
You can use linear regression for simple tasks such as house price prediction, the most common practice problem, where the price is predicted from parameters such as the number of windows, doors, rooms, and halls. In an agent, a model like this can serve as one of the tools that a framework such as LangChain calls when it needs a numeric prediction.
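As a sketch of the house price idea, the example below fits a linear regression on a tiny synthetic data set. The prices are generated from a known made-up rule (50 per room, 10 per window, plus 100) so that you can check the model recovers it; none of these numbers come from real housing data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic features: each row is [rooms, windows].
features = np.array([[2, 4], [3, 5], [3, 8], [4, 6], [5, 10], [6, 9]])

# Prices built from a known rule so the fit can be verified:
# price = 50 * rooms + 10 * windows + 100 (made-up units)
prices = 50 * features[:, 0] + 10 * features[:, 1] + 100

model = LinearRegression().fit(features, prices)
print(f"Weights: {model.coef_}")        # one weight per feature
print(f"Intercept: {model.intercept_}")

# Predict the price of a house with 4 rooms and 7 windows.
predicted = model.predict(np.array([[4, 7]]))[0]
print(f"Predicted price: {predicted:.1f}")
```

Because the data follows the rule exactly, the learned weights match it. Real data is noisy, so the fit would only approximate the underlying relationship.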
If you want to learn more about linear regression, check out my articles on machine learning and machine learning models.
Now let's look at some code that prepares data and trains a model on it. We will fit a linear regression model and print its mean squared error, intercept, and weights. The result is a model that predicts a numeric value from numeric input.
Code
import matplotlib.pyplot as plt # pip install matplotlib
import numpy as np # pip install numpy
# pip install scikit-learn for sklearn
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
# Linear regression on scikit-learn's built-in diabetes dataset
diabetes = datasets.load_diabetes()
# diabetes.keys()
# Use a single feature (column 2, BMI) so the fit can be drawn as a 2D plot.
# To train on all features, use diabetes.data without the [:, np.newaxis, 2] slice
# (the 2D plot at the end would then no longer apply).
diabetes_x = diabetes.data[:, np.newaxis, 2]
# print(diabetes_x)
diabetes_x_train = diabetes_x[:-30] # all samples except the last 30, for training
diabetes_x_test = diabetes_x[-30:] # last 30 samples, held out for testing
diabetes_y_train = diabetes.target[:-30]
diabetes_y_test = diabetes.target[-30:]
model = linear_model.LinearRegression()
model.fit(diabetes_x_train, diabetes_y_train)
diabetes_y_predict = model.predict(diabetes_x_test)
print(f"Mean Squared Error: {mean_squared_error(diabetes_y_test, diabetes_y_predict)}")
print(f"Weights: {model.coef_}")
print(f"Intercept: {model.intercept_}")
plt.scatter(diabetes_x_test, diabetes_y_test)
plt.plot(diabetes_x_test, diabetes_y_predict)
plt.show()
This produces a plot showing the best-fit line, which predicts the target value for new inputs based on the pattern in the training data. This is the graph you get after running the program above.
Output of Linear Regression Model

In the terminal you can see the mean squared error, the weights, and the intercept of the fitted linear regression model. You can learn more about this model and the full code in my other article, which focuses on what a linear regression model is in machine learning.
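To see what those three printed numbers mean, here is a small hand calculation with NumPy. With one feature, the model's prediction is weight * x + intercept, and the mean squared error is the average of the squared differences between true and predicted values. The numbers below are made up for illustration.

```python
import numpy as np


def predict_line(x: np.ndarray, weight: float, intercept: float) -> np.ndarray:
    """Prediction of a one-feature linear model: weight * x + intercept."""
    return weight * x + intercept


def mean_squared_error_manual(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Average of the squared differences between true and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))


x = np.array([0.0, 1.0, 2.0])
y_true = np.array([1.0, 3.0, 6.0])

y_pred = predict_line(x, weight=2.0, intercept=1.0)  # gives [1, 3, 5]
mse = mean_squared_error_manual(y_true, y_pred)      # (0 + 0 + 1) / 3

print(f"Predictions: {y_pred}")
print(f"Mean Squared Error: {mse:.4f}")
```

A lower mean squared error means the line sits closer to the test points on average.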
Thanks For Reading My Article
I hope this article was helpful. If it was, please share it with your friends and colleagues, leave a comment in the comment section below, and explore more articles like this on my website, dexteritycoder.com. Thanks for reading!