Introduction to Material Properties

Explore how to extract and organize material properties of aluminum alloys from websites using Python. Learn HTML basics, use BeautifulSoup to parse content, and automate data export to Excel with Openpyxl.

We'll cover the following...

- Material properties
- About HTML

Material properties

You used Requests earlier to scrape a website and do something with the results. That website functioned more like an API because you knew exactly what format it would return. Data does not always arrive so nicely formatted. Sometimes you have to find, format, clean, or otherwise manipulate the data yourself. When you scrape a real website, you need to parse HyperText Markup Language (HTML) to get the information that you want. On any webpage that you visit, HTML defines the structure and content of the page, and its sister language, Cascading Style Sheets (CSS), controls how that content looks.

The Engineering Toolbox website [1] contains technical data, material properties, chemical properties, economic information, drawing tools, and much more for all types of engineers. You will write a Python program that grabs the material properties of different aluminum alloys from the Engineering Toolbox website and puts them into an Excel spreadsheet so you can reference them later.

BeautifulSoup4 is the library used here for HTML parsing, and its documentation [2] is very good. Openpyxl is the cross-platform library used here for manipulating Excel spreadsheets; its documentation can be found under footnote [3].

The program will need to go to the material properties page [4], convert the table into a nested list (one inner list per table row) so the information is easy to work with, and iterate through that list to put each value into a cell in an Excel spreadsheet.
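The last of those steps, copying a nested list into spreadsheet cells, can be sketched with Openpyxl as follows. This is a minimal sketch, not the lesson's final program: the function name `rows_to_workbook`, the column headers, the values, and the output file name are all placeholders made up for illustration, not data from the Engineering Toolbox page.

```python
from openpyxl import Workbook

# Placeholder rows standing in for the scraped table (not real material data).
rows = [
    ["Alloy", "Property A", "Property B"],
    ["alloy-1", 1.0, 2.0],
    ["alloy-2", 3.0, 4.0],
]


def rows_to_workbook(rows):
    """Copy a nested list into the active worksheet, one value per cell."""
    workbook = Workbook()
    sheet = workbook.active
    # Openpyxl rows and columns are 1-indexed, so both counters start at 1.
    for row_index, row in enumerate(rows, start=1):
        for column_index, value in enumerate(row, start=1):
            sheet.cell(row=row_index, column=column_index, value=value)
    return workbook


workbook = rows_to_workbook(rows)
# Uncomment to write the file to disk (hypothetical file name):
# workbook.save("aluminum_alloys.xlsx")
```

Keeping the data as a plain nested list means the spreadsheet step does not need to know anything about HTML; it only walks rows and columns.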

About HTML

HTML Tag	What it represents
`<a>`	Hyperlink
`<div>, <p>`	Generic block container and paragraph of text
`<table>, <tr>, <td>, <tbody>, <thead>`	Table, table row, table cell, table body, and table header section
`<style>, <span>`	Embedded CSS rules and an inline container for text
`<input>`	A field to input data
`<i>, <b>, <u>, <strike>, <sup>`	Font styles like italics, bold, underline, strikethrough, superscript, respectively
`<img>`	Image

When information on a website is not nicely formatted, you have to dive into the HTML to get the information yourself. That means using Requests to download the text of the page, which is its full HTML source code. Then, you give the source code to BeautifulSoup, which parses it so you can more easily find the information that you want. BeautifulSoup lets you ask for every element with a given tag and returns the matches as a list, so you can get only <tr> tags, only <a> tags, etc.
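As a small illustration of pulling out tags by name, the sketch below parses a made-up HTML string with BeautifulSoup. The HTML, the link targets, and the table values are placeholders invented for this example; in the real program the string would instead be the text that Requests returns for the page.

```python
from bs4 import BeautifulSoup

# Stand-in HTML (placeholder content, not taken from any real page).
html = """
<html><body>
<p>See the <a href="/page-one">first link</a> and
<a href="/page-two">second link</a>.</p>
<table>
  <tr><th>Alloy</th><th>Property</th></tr>
  <tr><td>alloy-1</td><td>1.0</td></tr>
  <tr><td>alloy-2</td><td>2.0</td></tr>
</table>
</body></html>
"""

# Python's built-in "html.parser" avoids needing an extra parser package.
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like collection of every matching tag.
links = [a["href"] for a in soup.find_all("a")]

# Build a nested list: one inner list of cell texts per <tr>.
table_rows = []
for tr in soup.find_all("tr"):
    cells = tr.find_all(["th", "td"])
    table_rows.append([cell.get_text(strip=True) for cell in cells])

print(links)
print(table_rows)
```

Note that `get_text` returns strings, so numeric cells such as "1.0" would still need converting with `float` before doing any arithmetic on them.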
