
Introduction to Streamlit using Python: create data applications

Introduction:

   Photo by Luke Chesser on Unsplash

If you are a data scientist and not an expert in web frameworks, you must often have wished for a tool that would magically turn your data science work into an interactive data application. For what seemed like an eternity there was no such tool, and we all had to build dashboards and whatnot to stand in for an interactive data app. But now the wait is over: Streamlit is here.

What is Streamlit?

Streamlit, as its official website puts it, is the "fastest way to build and share data apps." Now that you are getting really excited, let me give you some more good news. Streamlit is not a separate piece of app-building software but an ordinary Python library, so data apps can be created with the streamlit package alone. You launch an app from your terminal with "streamlit run script_name.py", where script_name.py is just a normal Python script containing the app code.

What is this article going to discuss?

This article is the first post in the Streamlit series I am going to write about the different uses, API docs, functionality and deployment experiments around Streamlit. In this article I will create 2 very easy data-driven Streamlit applications, both of which are launched from the terminal.

If you want to follow the code and understand it hands-on, please download the code from my GitHub repo for Streamlit experiments. Now, let's dive in.

First! Install streamlit:

Installing Streamlit is as easy as installing any Python package. Run pip3 install streamlit or pip install streamlit, as per your pip configuration, and Streamlit will install.

Once the installation completes, check that it worked by typing streamlit hello in your terminal. This should open a demo application from Streamlit. Before that there are a few formalities, such as entering an email address and accepting the Streamlit privacy notice. In short, your terminal will look like the following once you run the command and get through the formalities:


It may also happen that Streamlit installs perfectly, but streamlit hello fails when you run it. Most probably it will say something like:

AttributeError: module 'google.protobuf.descriptor' has no attribute '_internal_create_key'

The solution in this scenario is to upgrade your protobuf package. Run pip3 install --upgrade protobuf and let the package upgrade. This should solve the issue.
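Before upgrading, it can help to see which versions are actually installed. The following is a minimal sketch using only the standard library (Python 3.8 or newer); the helper name installed_version is hypothetical and not part of Streamlit or protobuf:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report both packages involved in the error above.
for name in ("streamlit", "protobuf"):
    print(name, installed_version(name) or "not installed")
```

Running it again after the upgrade should show a newer protobuf version.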

Now that we are all set with the Streamlit installation (note that the fix above is Linux specific), we will proceed to explore how to write Streamlit apps.

3 basic Streamlit APIs you need to know to write your apps:

Streamlit has a lot of APIs for different kinds of output, and much could be written about each of them; we will get started with 3 basic ones that are enough to begin writing apps. These are:

(1) st.title(): 

This API creates a title element in the data app. The title appears in bold at the top of the app page. It takes a string argument and renders it in the title position, which makes it a useful way to give your app a proper heading.

(2) st.write():

This API is called the Swiss Army knife of Streamlit. Even if you don't know the more specific APIs for each kind of output, you can use st.write() to print, show or display elements in your app.

(3) st.subheader():

This API is useful for creating sections in a very simple single-column layout. It renders its string argument as large, bold text, which helps divide your app into sections.

Now you will see how we apply these to write our first test app. First, open your script and write the following line to import streamlit:

import streamlit as st

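Next we will try a series of commands to find out what works in st.write(), st.subheader() and st.title() and what doesn't. The later snippets also use pandas, numpy and altair, so they assume import pandas as pd, import numpy as np and import altair as alt at the top of the script.


Let's go through each of the lines and understand what it does.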

1. st.write("\frac{1}{2} does it support lateX?") 

In this line we check whether LaTeX works in Streamlit. \frac{1}{2} is the standard LaTeX notation for the fraction 1/2, but the output shows that st.write() does not render this string as a formula.
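One detail worth checking here, independent of Streamlit: in an ordinary Python string literal, \f is the escape sequence for a form feed character, so the string above never contains the text \frac at all. A small sketch in plain Python that shows the difference; the final comment points at st.latex(), Streamlit's dedicated element for formulas, as one option to try:

```python
plain = "\frac{1}{2}"   # "\f" is parsed as a form feed character (ASCII 12)
raw = r"\frac{1}{2}"    # a raw string keeps the backslash as written

print(repr(plain))  # '\x0crac{1}{2}'
print(repr(raw))    # '\\frac{1}{2}'

# With the raw string, the formula can be passed to Streamlit's LaTeX element:
#   st.latex(raw)
```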


2. st.write("a normal string") 

In this line we simply check whether a plain string can be output. The string prints fine in the app.


3. st.write("<b>food</b> is great here; checking if simple html works",
         unsafe_allow_html = True)

We write this third line to check whether HTML can be rendered, and the answer, to my awe, is yes: st.write() can take HTML and render it when unsafe_allow_html is set to True. At the same time, the Streamlit guide warns that they are working on a dedicated API for writing HTML content, so the unsafe_allow_html parameter is expected to be deprecated once that API is up and running.


4. st.write(1234) 

We wrote this line to check whether a simple integer can be output as well. Again, being the jack of all prints, st.write() simply prints the number.


5. st.write(pd.DataFrame({"first_column":[10,20,30,40],
                       "second_column":[105,405,905,1605]}))

To my surprise, st.write() even prints a dataframe object as a table in the app. It is common sense, though, not to print very big dataframes this way, as that would clutter the app and defeat the purpose of the tool.


6. st.header("it shows chart too")

With the header() API, which is very similar to the subheader API, we create a heading before showing a chart. header() is again a very simple API: it takes a string argument and creates an h2 tag out of it (st.title() produces the h1 and st.subheader() the h3).


7.

df = pd.DataFrame(np.random.randn(200, 3),
                  columns=['a', 'b', 'c'])

c = alt.Chart(df).mark_circle().encode(x='a', y='b',
                                       size='c', color='c',
                                       tooltip=['a', 'b', 'c'])
st.write(c)

In the above lines we create a random dataframe, use alt.Chart to build a chart that plots the data points in 2D space, and finally print the chart object using st.write().

This shows that write() can print charts too. It is a very relieving fact that you can use write() to print text, show dataframes and even show small charts and images. As useful as this is for a beginner who wants to write small apps within an hour, you will need the finer APIs optimized for data, media and charts to display them efficiently throughout your apps. We will focus on these APIs in our second post on Streamlit.

Now you can write some more of these lines, as I have done in the first test app, and then save the script.

To actually run your script, go to the terminal and write:

$streamlit run my_test_app.py

Once you press enter, the terminal prints a local URL and a network URL. We will see what these mean when we talk about deploying the apps elsewhere, i.e. effectively hosting them. For now, once the URLs pop up, your default browser will also open and the app will start running at the localhost address.

Your terminal will look something like this:

And my app looks something like this when it opens in Google Chrome and starts running.

 

Press ctrl+c in the terminal at any time to stop your app. Note that closing the browser page does not stop the app.

I hope you enjoyed writing your first static data app using Streamlit. In the next section of this post I show how I created a fully interactive spaCy-run NLP app using Streamlit. You will not need much spaCy knowledge to recreate the data app, as the code is already in the aforementioned repo, but you can always read about spaCy in my spaCy series.

Spacy-run data app description:

Now, in the second section, we will see how to create interactive apps. An interactive application is an app that lets the user interact with its components by typing, giving inputs or even through AR/VR.

Streamlit offers a large variety of interactive input widgets, of which we will explore a few: buttons, radio buttons, checkboxes and text inputs, to create our fully interactive spaCy app. Let's go through the code for each interactive part and explain it.

First let's inspect the button API. st.button() has a parameter named label, which you will see in many of the interactive widgets. Via label you provide a string that shows up as the name of the widget. st.button() returns a boolean value: True if the button was clicked in the app, otherwise False.

Now, consider the following implementation of a button in the app.

if st.button("choose big model"):
    nlp = spacy.load("en_core_web_lg")
else:
    nlp = spacy.load("en_core_web_sm")


In the above conditional we ask the user which model to use. If the user clicks the button, st.button() returns True, so we load en_core_web_lg, the bigger model. If the user leaves the button alone, st.button() returns False, so we take the else branch and load the smaller model, en_core_web_sm.
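The button logic above boils down to mapping a boolean to a model name. Here is a minimal sketch that isolates that mapping so it can be checked without running the app; the function name choose_model_name is hypothetical, and the commented line shows how it could plug into the button call:

```python
def choose_model_name(use_big_model):
    """Map the boolean returned by st.button() to a spaCy model name."""
    return "en_core_web_lg" if use_big_model else "en_core_web_sm"


# In the Streamlit script this could replace the if/else:
#   nlp = spacy.load(choose_model_name(st.button("choose big model")))
print(choose_model_name(True))   # en_core_web_lg
print(choose_model_name(False))  # en_core_web_sm
```

Keep in mind that Streamlit reruns the script on every interaction, and st.button() is True only for the run triggered by the click. If the choice should persist across reruns, a checkbox or radio button may be a better fit.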

If you don't know what these models are in spaCy, you can read about the different models here.

Moving on, we will see how to implement a checkbox using the st.checkbox() API. st.checkbox() is similar to the button API, with a label parameter and a bool return value, but in the UI it shows a box to tick with the label beside it. Let's see how we have used a checkbox in the app.

st.subheader("trying out checkbox")
check = st.checkbox("will you let your data get stored?")

if check:
    print("fake notion! nothing recording now!")
else:
    pass

Clearly, the implementation is similar to that of the button; the difference shows in the UI. In this example we create the checkbox with the label "will you let your data get stored?" and then act on its value.

One important point to note here: if you use print() and run the app from the terminal, the output appears in the terminal window. To print on the app screen you have to use st.write() again.

Now we will explore radio buttons. Radio buttons let the user pick exactly one of several options. The syntax is st.radio(label, options), where options lists the choices and label is the text shown above them. Unlike normal buttons and checkboxes, st.radio() returns the chosen option. I implemented one radio button in my spaCy app to choose which action the user wants to perform on the text they enter next. Let's see how I did that:

st.subheader("radio button")
radio = st.radio("choose your action",options = ["entity checking",
                                                 "dependency tree showcase",
                                                 "pos tagging"])
if radio == "entity checking":
    print("entity is chosen")
elif radio == "dependency tree showcase":
    print("dependency tree visualization is done")
elif radio == "pos tagging":
    print("pos tagging will be shown")
else:
    pass


You see! The implementation is as simple as calling the API and filling in the parameters. Here I store the value returned by st.radio() in a variable named radio so I can use it later, and print a message for the chosen option. At the end of this discussion we will see how I use the radio variable to perform the user-chosen task. But for now, let's see our final interactive component: the text input.
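The if/elif chain above can also be written as a lookup table, which keeps the option labels in one place. This is a minimal sketch in plain Python; ACTIONS and describe_action are hypothetical names, and the commented lines show how the same keys could feed st.radio():

```python
# Option label -> message, using the same strings as the radio button above.
ACTIONS = {
    "entity checking": "entity is chosen",
    "dependency tree showcase": "dependency tree visualization is done",
    "pos tagging": "pos tagging will be shown",
}


def describe_action(choice):
    """Return the message for a radio choice, or a fallback for unknown values."""
    return ACTIONS.get(choice, "unknown action")


# In the app, the dictionary keys can double as the radio options:
#   radio = st.radio("choose your action", options=list(ACTIONS))
#   st.write(describe_action(radio))
print(describe_action("pos tagging"))  # pos tagging will be shown
```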

For text input there is a function called st.text_input(). We use two of its parameters: the label, of course, and a default string (the value parameter) that sits in the text box from the beginning. So the syntax is basically

st.text_input(label, example_string_in_box)

In the UI of the application the label appears at the top, followed by the text box with the example string in it. You can delete that string, write your own text and press enter to submit the input.


 

st.text_input() returns the text written in the text box; if the user leaves it untouched, it returns the example string we put in earlier. It is particularly easy to implement, but as you can probably tell, it is a very powerful tool. See below how we implemented it:

st.subheader("write the text")
text = st.text_input("text box","Example: write here")
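One thing to watch for: because the example string is returned when the user leaves the box untouched, the app cannot tell it apart from real input unless you check for it. Here is a small sketch of such a check, assuming the default string used above; clean_user_text is a hypothetical helper, not part of Streamlit:

```python
DEFAULT_TEXT = "Example: write here"


def clean_user_text(text, default=DEFAULT_TEXT):
    """Return the stripped text, or None if the box is empty or still holds the default."""
    stripped = text.strip()
    if not stripped or stripped == default:
        return None
    return stripped


# In the app this would wrap the value returned by the text box:
#   text = st.text_input("text box", DEFAULT_TEXT)
#   user_text = clean_user_text(text)
print(clean_user_text("  Apple is buying a startup  "))  # Apple is buying a startup
print(clean_user_text(DEFAULT_TEXT))                     # None
```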

The rest of the app just processes the text we took in and performs one of the three actions we offer the user. We print the results using st.write(), just like in the test app.
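As a rough sketch of what that processing can look like, the helpers below turn a spaCy Doc into rows that st.write() can display. They rely only on standard Doc, Span and Token attributes (ents, label_, pos_, dep_, head); the function names are hypothetical illustrations and may differ from the code in the repo:

```python
def entity_rows(doc):
    """List (text, label) pairs for each named entity in a spaCy Doc."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def pos_rows(doc):
    """List (token text, part-of-speech tag) pairs for each token."""
    return [(token.text, token.pos_) for token in doc]


def dependency_rows(doc):
    """List (token text, dependency label, head token text) triples."""
    return [(token.text, token.dep_, token.head.text) for token in doc]


# Inside the app (assuming nlp, text and radio from the earlier snippets):
#   doc = nlp(text)
#   if radio == "entity checking":
#       st.write(entity_rows(doc))
#   elif radio == "pos tagging":
#       st.write(pos_rows(doc))
#   else:
#       st.write(dependency_rows(doc))
```

Keeping the extraction in small functions like these means the spaCy logic can be tried in a normal Python session before it is wired into the Streamlit widgets.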

We take in the text, process it using the model loaded earlier via the model button, and finally use different spaCy functionality to solve the tasks. Follow the GitHub code to understand it, or write your own version of the same app. I will leave you with a final picture of how the app looks, as you have understood most of the front-end commands by now.

The spacy text-processing app

Thanks for reading! I will create some more complex applications next week and discuss the real challenges of hosting them in a cloud environment. Till then, stay tuned!

