Four Programming Languages Creating a Complete Website Scraper Application

Nonfiction, Computers, Internet, Web Development, Java, Programming
Author: Stephen J Link
ISBN: 9781311735225
Publisher: Stephen J Link
Publication: September 6, 2014
Imprint: Smashwords Edition
Language: English

By the end of this book you will have a complete application that runs as either a console or a desktop program. You will build it in three languages: C#, VB.Net, and Java. Each chapter covers a single language and either the desktop or the console version written in that language (the Java chapter includes only the desktop version). To automate the console program, we will use an Excel sheet with VBA code, which serves as the fourth language. The desktop application allows more flexibility in web page processing, with entry fields for the beginning and ending text along with DIVs and other processing options. Enjoy this learning experience.
This list includes some of the types and commands covered, and the languages that use them:
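The beginning and ending text fields suggest a simple marker-based extraction: find the beginning text, find the ending text after it, and keep what lies between. The following is a minimal Java sketch of that idea, not the book's own code. The class name `MarkerExtractor`, the method `between`, and the sample page are hypothetical and exist only for illustration.

```java
// Hypothetical helper for illustration; not taken from the book.
final class MarkerExtractor {

    /**
     * Returns the trimmed text between the first occurrence of begin and the
     * next occurrence of end after it, or an empty string if either is missing.
     */
    static String between(String page, String begin, String end) {
        int start = page.indexOf(begin);
        if (start < 0) {
            return "";
        }
        start += begin.length();              // skip past the beginning marker
        int stop = page.indexOf(end, start);  // search only after the marker
        if (stop < 0) {
            return "";
        }
        return page.substring(start, stop).trim();
    }

    public static void main(String[] args) {
        // Made-up sample page; a real run would use downloaded HTML.
        String page = "<html><body><div id=\"main\"> Hello, world. </div></body></html>";
        System.out.println(between(page, "<div id=\"main\">", "</div>"));
    }
}
```

Returning an empty string when a marker is missing is one possible design choice; a batch run over many pages could then log the page and continue instead of stopping.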

WebResponse, WebRequest, HttpWebRequest, StreamReader (C#/VB)
GetResponse, Regex.Replace, String.Replace, IndexOf (C#/VB)
Substring, ReadLine, Trim, WriteLine (C#/VB)
EndsWith, AddRange, ReadToEnd, Count (C#/VB)
GetCommandLineArgs, GetResponseStream (VB)
getText, endsWith, split, length, openConnection (Java)
toString, BufferedReader, getSelectedIndex, replaceAll (Java)
isEmpty, substring, indexOf, readLine, PrintWriter, write (Java)
ActiveCell, Value, ChDir, Shell, Activate (VBA)
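To give a feel for how several of the Java calls listed above can fit together, here is a rough sketch that reads HTML line by line, strips tags, and collects the remaining text. It is an assumption about typical usage, not the book's implementation. The class `PageTextCleaner`, its method names, the naive tag-stripping pattern, and the UTF-8 charset are all hypothetical choices.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical class for illustration; not taken from the book.
final class PageTextCleaner {

    /** Reads HTML line by line, strips tags, and returns the non-empty lines of text. */
    static String toPlainText(Reader html) {
        StringWriter buffer = new StringWriter();
        try (BufferedReader reader = new BufferedReader(html);
             PrintWriter out = new PrintWriter(buffer)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Naive tag removal: enough for a sketch, not a real HTML parser.
                String text = line.replaceAll("<[^>]*>", "").trim();
                if (!text.isEmpty()) {
                    out.write(text + "\n");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    /** Opens a live page for reading; the UTF-8 charset is an assumption. */
    static Reader open(String address) throws IOException {
        return new InputStreamReader(
                URI.create(address).toURL().openConnection().getInputStream(),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up sample markup, so the demo needs no network access.
        String sample = "<h1>Title</h1>\n<p>First <b>bold</b> line.</p>\n<br>\n";
        System.out.print(toPlainText(new StringReader(sample)));
    }
}
```

To process a live page, the result of `open` can be passed to `toPlainText` in place of the `StringReader`.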

Why would you want to work with the same program in multiple languages? The simple answer is versatility. You may come across a need for Java where a .NET-based language just won't work. A good example is Windows versus Linux web hosting. A .NET program placed on a Windows-hosted site will work beautifully. If you then change the hosting plan to Linux, the .NET program will not run without some tweaking or a compatible runtime such as Mono. Had it been written in Java, it would most likely have moved over without changes.
Why would you want a web site text extraction program? If you only need to capture the main text from a few web pages, building a tool would be more trouble than it is worth. If you are migrating a web site designed in ASP.NET into another format, such as a CMS, this approach can be quite useful. If the site has 1,000 similarly structured pages, it may take a single person a week to copy and paste the body text from them manually. With the automated approach, pausing between pages for accuracy, approximately 700 pages can be processed per hour, so the same 1,000 pages take under an hour and a half. That is a tremendous labor savings.
