This exercise may be simple... but it isn't a bad place to get started. AND it can be developed into a wealth of useful tools. I wrote it along the road to providing myself with a way to see how the different pages at my website link to one another.
In this tutorial, you'll be able to check that you have Synapse installed, to support your Lazarus (and, I believe, Delphi) programming.
We're going to create a tiny application. The user gives it the URL of a page on the World Wide Web, and the application goes off, fetches the HTML behind that page, and displays it in a memo.
Every web-browser I ever encountered offers something similar, of course... so why write your own?
Once you know how to get the web page's HTML into a memo, you have taken care of the biggest thing between you and a program to scan multiple pages for whatever you want. And that's just for starters.
The work will be done with Lazarus and the free TCP/IP library called Synapse. I have a page about why I like Synapse, if you haven't made your choice yet.
In any case, the application would serve as a quick test of your Synapse installation. (I've done notes on installing Synapse, so that your Lazarus installation has access to the tools Synapse provides.)
Start a new application... The "blank" project loaded when you tell Lazarus "File/ New/ Project/ Application" will suffice.
Name your form. I called my project LDN183 (Lazarus Demo, New series), and the main form LDN183f1. You'll see those names, and variants of them, in what follows. Either use the same names, or make adjustments as necessary.
Save what you've got so far. I called the unit LDN183u1, and the project LDN183.
Add a button to the form, call it buQuit, and create an OnClick handler for it which does Application.Terminate, or Close.
Make sure that much works!
Put a memo on the page. Call it meDisplayPage. Set the Scrollbars property to AutoBoth.
Depending on what you are going to do with what you can see once the HTML has been fetched, you may or may not want to set the memo's "WordWrap" property to true.
In the app's code, add...
httpsend
... to the Uses clause
.. and run the application. The httpsend won't "do" anything yet... but if your setup isn't right yet, the compiler will complain. Might as well get that working now, fix one thing at a time.
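If you would like to test the Synapse installation separately from the form, a minimal console sketch follows. It is not part of the original project: the program name SynapseCheck is made up for illustration, and it assumes the compiler can already find Synapse's units (for instance because the Synapse source folder is on the unit search path, or the Synapse package has been added to the project's requirements).

```pascal
program SynapseCheck;

{$mode objfpc}{$H+}

uses
  httpsend; // Synapse unit; compilation stops here if Synapse can't be found

var
  Sender: THTTPSend;
begin
  // Creating and freeing the object shows that the unit compiles and links.
  // Nothing is sent over the network.
  Sender := THTTPSend.Create;
  try
    WriteLn('httpsend is available.');
  finally
    Sender.Free;
  end;
end.
```

If this compiles and prints its message, the same Uses entry should work in the form's unit too.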
Copy the following into the shell- of- a- program you've built so far. Put it just before the "end." (...the "end" at the end of the code!)
function DownloadHTTP(URL, TargetFile: string): Boolean;
//From http://wiki.lazarus.freepascal.org/Synapse
//Accessed 23 Oct 17
var
  HTTPGetResult: Boolean;
  HTTPSender: THTTPSend;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  try
    HTTPGetResult := HTTPSender.HTTPMethod('GET', URL);
    if (HTTPSender.ResultCode >= 100) and (HTTPSender.ResultCode<=299) then begin
      HTTPSender.Document.SaveToFile(TargetFile);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;
... and put...
function DownloadHTTP(URL, TargetFile: string): Boolean;
... in one of the declarations sections of the class definition further up the page. I used the "private" section. The "public" would probably "do", too.
Go back to the long thing you inserted a moment ago, and add "TLDN183f1." in front of the "DownloadHTTP(URL, TargetFile...)" (Assuming you called your application LDN183, and the form LDN183f1, of course. Adapt as necessary, if you didn't.)
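Put together, the unit might look roughly like the sketch below at this stage. It assumes the names used on this page (LDN183u1, TLDN183f1, buQuit, meDisplayPage) and the usual uses list Lazarus generates for a new form; your generated list may differ. The unit only compiles alongside the matching .lfm form file that Lazarus maintains for you, so treat it as a map of where the pieces go rather than something to paste in wholesale.

```pascal
unit LDN183u1;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, Forms, Controls, Graphics, Dialogs, StdCtrls,
  httpsend; // the Synapse unit added by hand

type

  { TLDN183f1 }

  TLDN183f1 = class(TForm)
    buQuit: TButton;
    meDisplayPage: TMemo;
    procedure buQuitClick(Sender: TObject);
  private
    // The declaration goes here...
    function DownloadHTTP(URL, TargetFile: string): Boolean;
  public
  end;

var
  LDN183f1: TLDN183f1;

implementation

{$R *.lfm}

procedure TLDN183f1.buQuitClick(Sender: TObject);
begin
  Close;
end;

// ... and the body goes here, with "TLDN183f1." in front of the name.
function TLDN183f1.DownloadHTTP(URL, TargetFile: string): Boolean;
var
  HTTPGetResult: Boolean;
  HTTPSender: THTTPSend;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  try
    HTTPGetResult := HTTPSender.HTTPMethod('GET', URL);
    if (HTTPSender.ResultCode >= 100) and (HTTPSender.ResultCode <= 299) then
    begin
      HTTPSender.Document.SaveToFile(TargetFile);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;

end.
```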
Now we have a framework in place.
Time to start exercising it...
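You already have a memo, meDisplayPage, from earlier. (If you called yours something else, for instance meDisplayHTML, adjust the code below to match.)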
Add an edit box, call it eURL. Use the Object inspector to make eURL.text "https://sheepdogguides.com" (no quotes)
Add a button. Call it buFetch, make caption "Fetch Page".
Make its OnClick handler...
procedure TLDN183f1.buFetchClick(Sender: TObject);
var
  boTmp:boolean;
begin
  boTmp:=DownloadHTTP(eURL.text, 'TmpPageFetch.htm');
end;
... and run the program. After you have run it, look in the folder your .exe is in. You should find a file called TmpPageFetch.htm, which wasn't there before. Double click, and it should open in your browser, being a local copy of what is out there on the internet.
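A bare filename like 'TmpPageFetch.htm' is resolved against the program's current working directory, which is usually, but not always, the folder the .exe is in. If the file doesn't turn up where you expect, one option is to build the full path explicitly. The console sketch below only shows how such a path could be put together; the program name TargetPathDemo is made up for illustration.

```pascal
program TargetPathDemo;

{$mode objfpc}{$H+}

uses
  SysUtils;

var
  TargetFile: string;
begin
  // ParamStr(0) is the running program's own path on most platforms;
  // ExtractFilePath keeps the folder part, including the trailing separator.
  TargetFile := ExtractFilePath(ParamStr(0)) + 'TmpPageFetch.htm';
  WriteLn('File would be saved as: ', TargetFile);
end.
```

In the form's code, the equivalent would be to pass ExtractFilePath(Application.ExeName) + 'TmpPageFetch.htm' as the second argument to DownloadHTTP.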
Okay... not QUITE what I wanted... but a starting point.
I wanted the source of the page loaded to meDisplayPage.
But first... How did I know how to write the call to DownloadHTTP?
I didn't! But looking at the header...
DownloadHTTP(URL, TargetFile: string): Boolean;
... gave me BIG hints, and with a little fumbling, I got there!
The following is "better", albeit a bit "round the houses"...
procedure TLDN183f1.buFetchClick(Sender: TObject);
var
  boTmp:boolean;
begin
  boTmp:=DownloadHTTP(eURL.text, 'TmpPageFetch.htm');
  meDisplayPage.Lines.LoadFromFile('TmpPageFetch.htm');
end;
I suspect there's a way to use filestreams for this, and avoid the disk thrash... but after two hours of failing to find that solution, I "moved on". (If you can show me how to do this with filestream objects, I'd be grateful!)
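One possible answer to that question, offered as a sketch rather than as the way this project was finally done: THTTPSend.Document is itself a memory stream, so the fetched page can be loaded into any TStrings (a memo's Lines, for instance) without going through a file. The helper name FetchPageToStrings and the address http://example.com/ are made up for illustration. The sketch uses a plain http:// address; as far as I know, https:// addresses need one of Synapse's SSL plugin units in the uses clause as well, which is outside what this page covers.

```pascal
program FetchToStrings;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, httpsend;

// Hypothetical helper: fetches URL and loads the reply body into Target.
// Returns True only if the server answered with a 2xx status code.
function FetchPageToStrings(const URL: string; Target: TStrings): Boolean;
var
  HTTPSender: THTTPSend;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  try
    if HTTPSender.HTTPMethod('GET', URL)
      and (HTTPSender.ResultCode >= 200)
      and (HTTPSender.ResultCode <= 299) then
    begin
      // Make sure reading starts at the beginning of the stream.
      HTTPSender.Document.Position := 0;
      Target.LoadFromStream(HTTPSender.Document);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;

var
  Page: TStringList;
begin
  Page := TStringList.Create;
  try
    if FetchPageToStrings('http://example.com/', Page) then
      WriteLn(Page.Text)
    else
      WriteLn('Fetch failed');
  finally
    Page.Free;
  end;
end.
```

In the form, the same helper could be called with meDisplayPage.Lines as the second argument.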
I struggled for ages, trying to get DownloadHTTP to write to a TFilestream object.
Wasn't finding much in the way of web-chatter about DownloadHTTP...
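And then discovered HTTPGetText, in Synapse's httpsend unit (see http://synapse.ararat.cz/doc/help/httpsend.html#HttpGetText)...
.. and two minutes later had a working program. (The code is given a little further down the page.)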
THE PROGRAM ISN'T "good" in its current state... It doesn't check for, or deal with, "problem" situations... but it works, and the "other stuff" can be built on top of this foundation! ("Problem" situations... e.g. the server you tried to access isn't responding... will be indicated by boTmp being returned false. I'm pretty sure there are ways to know what the problem was, too. See the Lazarus wiki about using Synapse if you want to explore those topics.)
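As a hedged sketch of how "what the problem was" might be found out: THTTPSend exposes the HTTP status through ResultCode and ResultString, and the underlying socket's last error through Sock.LastError and Sock.LastErrorDesc. The helper name FetchReport, the wording of the messages and the address http://example.com/ are made up for illustration; this is not the code of any of the LDN programs.

```pascal
program FetchWithDiagnostics;

{$mode objfpc}{$H+}

uses
  SysUtils, blcksock, httpsend; // blcksock declares the socket class behind Sock

// Hypothetical helper: returns True on a 2xx answer, and always fills
// Report with a short description of what happened.
function FetchReport(const URL: string; out Report: string): Boolean;
var
  HTTPSender: THTTPSend;
begin
  HTTPSender := THTTPSend.Create;
  try
    if not HTTPSender.HTTPMethod('GET', URL) then
    begin
      // No usable HTTP reply at all: report the socket-level error.
      Result := False;
      Report := Format('No HTTP reply. Socket error %d: %s',
        [HTTPSender.Sock.LastError, HTTPSender.Sock.LastErrorDesc]);
    end
    else
    begin
      // The server answered: report its status line, e.g. 200 or 404.
      Result := (HTTPSender.ResultCode >= 200) and (HTTPSender.ResultCode <= 299);
      Report := Format('HTTP %d %s',
        [HTTPSender.ResultCode, HTTPSender.ResultString]);
    end;
  finally
    HTTPSender.Free;
  end;
end;

var
  Msg: string;
begin
  if FetchReport('http://example.com/', Msg) then
    WriteLn('OK: ', Msg)
  else
    WriteLn('Problem: ', Msg);
end.
```

Separating "no reply at all" from "the server replied, but with an error status" gives two different kinds of message to show the user.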
So... here's where I got to after my adventures....
A bit simpler than the previous one, and it won't thrash your hard drive!
procedure TLDN183f1.buFetchClick(Sender: TObject);
//For help, see http://synapse.ararat.cz/doc/help/httpsend.html#HttpGetText
var
  boTmp:boolean;
begin
  boTmp:=HTTPGetText(eURL.text,meDisplayPage.lines);
end;
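To try HTTPGetText without a form at all, a small console sketch like the following might do. The program name GetTextDemo and the address http://example.com/ are made up for illustration. As far as I can tell from the Synapse source, the boolean that comes back says whether the HTTP exchange completed, not whether the server's status code was a "good" one, so a "page not found" reply may still come back as true.

```pascal
program GetTextDemo;

{$mode objfpc}{$H+}

uses
  Classes, httpsend;

var
  Page: TStringList;
begin
  Page := TStringList.Create;
  try
    // HTTPGetText fills any TStrings: a TStringList here, a memo's Lines on the form.
    if HTTPGetText('http://example.com/', Page) then
      WriteLn('Fetched ', Page.Count, ' lines')
    else
      WriteLn('Fetch failed');
  finally
    Page.Free;
  end;
end.
```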
I hope you've found the information here useful? Facebook "likes", mentions of the page in forums, etc make me happy and they also help others who'd like to know this sort of thing FIND this page...
Confused? I was, when I revisited the page July 2022, after last editing it in 2017.
Sorry. I lied.
All you need is the simple little thing at the end. The "HTTPGetText" solution.
You don't need DownloadHTTP to "go with" the HTTPGetText solution. Sorry.
What you see above is the tale of my "journey of discovery". Maybe it will have helped you "get there", too? And maybe you want a way to save to file? And we wouldn't have had the nice Filestreams discussion, either?
(The "skim-read it first" advice was added July 2022, when I saw what I'd done.)
The material outside this section, the section presented in blue, is about LDN148. It is "simple", but it "works". The tutorial about that is "done".
From LDN148, I created LDN147. (Yes: the one with the lower number is the newer, more advanced, version.)
In many ways they are "the same". The user experience... when I get it working!... will be the same... except that LDN147 will return error codes, instead of a mere "pass/fail" evaluation of the attempt to fetch the webpage's HTML.
I am deeply indebted to https://wiki.freepascal.org/Synapse#Advanced_version, accessed 16 Jul 22 for the material I have generated here.
My function DoFetchPage had to be extensively re-written. But that was 90% of the work. LDN148 had been written reasonably "well"... as demonstrated by the fact that revising it wasn't especially difficult.
To email this page's editor, Tom Boyd.... Editor's email address. Suggestions welcomed! Please cite "LT-000-N-LET-BP.htm".
Page has been tested for compliance with INDUSTRY (not MS-only) standards, using the free, publicly accessible validator at validator.w3.org. Mostly passes.