An Overview of Web Scraping: Technical Aspects and Exercises
dc.rights.license | All rights reserved | en_US |
dc.contributor.advisor | Duffany, Jeffrey | |
dc.contributor.author | Pérez Molano, Gustavo | |
dc.date.accessioned | 2024-01-11T13:19:49Z | |
dc.date.available | 2024-01-11T13:19:49Z | |
dc.date.issued | 2023 | |
dc.identifier.citation | Pérez Molano, G. (2023). An Overview of Web Scraping: Technical Aspects and Exercises [Unpublished manuscript]. Graduate School, Polytechnic University of Puerto Rico. | en_US |
dc.identifier.uri | http://hdl.handle.net/20.500.12475/1995 | |
dc.description | Design Project Article for the Graduate Programs at Polytechnic University of Puerto Rico | en_US |
dc.description.abstract | Researchers and organizations can further their research goals by studying web scraping and using it correctly. This study reviews several web scraping techniques along with the legal and ethical implications of web scraping. Technical, legal, and ethical aspects are discussed to clarify the benefits and risks of the web scraping process. Three web scraping exercises are presented: the first uses the BeautifulSoup library in Python, the second uses the web scraping tool Octoparse, and the third uses ParseHub. The three experiences are discussed to provide insight into how the different techniques and programs compare. Key Terms: BeautifulSoup, Octoparse, ParseHub, Web scraping. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Polytechnic University of Puerto Rico | en_US |
dc.relation.ispartof | Computer Science; | |
dc.relation.ispartofseries | Fall-2023; | |
dc.relation.haspart | San Juan | en_US |
dc.subject.lcsh | Polytechnic University of Puerto Rico--Graduate students--Research | en_US |
dc.subject.lcsh | Data mining | en_US |
dc.subject.lcsh | Python (Computer program language) | |
dc.subject.other | Web scraping | |
dc.title | An Overview of Web Scraping: Technical Aspects and Exercises | en_US |
dc.type | Article | en_US |
dc.rights.holder | Polytechnic University of Puerto Rico, Graduate School | en_US |
This item appears in the following collection(s)
- Computer Science: Artículos de Proyectos de Ciencias en Computadoras (Computer Science Project Articles)