Browser Automation.pdf

  • Uploaded by: Aries Lhi
  • December 2019

browser automation

Question

Hi, I'd like to have a program that navigates to http://www.handelsblatt.com/News/def...ymbol=FLUK.NWX, selects "Times and Sales" from the "Darstellung" menu, clicks "aktualisieren" and copies the new table to a file. I'm still hoping I can deal with most of the steps, but I have no clue how to select from the dropdown menu. I'm using VB.NET 2005 Express. I'd really appreciate any kind of help. Thank you!!

Wednesday, November 28, 2007 11:07 AM

d.j.t

20 Points

Answers


Dominik: "What happens there (while it works fine most of the time) is that SOMETIMES the first table is copied, the one that was displayed when first browsing to the page, before doing the selections and refreshing. So to me it seems as if the script doesn't wait for the DocumentCompleted event any more. But only sometimes! Sometimes the correct table is copied, sometimes not. I don't understand this! (Actually, I never fully understood the DocumentCompleted event thing.) The only way I can explain it is that the old computer is too slow... I'm frustrated!"

Hi Dominik,

In Part 6 you are extracting the javascript immediately after automatically clicking the More button, without waiting for the next webpage to load with new data:

Code Snippet
    'Part 6 Automatically click Continue link
    Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
    For Each curElement As HtmlElement In hrefElementCollection
        Dim controlName As String = curElement.GetAttribute("id").ToString
        If controlName.Contains("LBtn_More") Then
            curElement.InvokeMember("Click")
        End If
    Next
    extract()

The code in my first post on this thread fixes that problem. The DocumentCompleted event fires when a new webpage loads. After clicking the button in Part 4 we have to wait for the next DocumentCompleted, which tells us that the next webpage has loaded with new data. The same applies to clicking the More button in Part 6 (see: http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):

Code Snippet
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Click Continue Button
            End If
        End If
    End Sub

But the If statements need to be refined a bit, because DocumentCompleted fires twice per page (once for the page banner and once for the default page containing the javascript data that we want):

Code Snippet
    If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
        .
        .
        .
    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then

The second problem is that you are using a 12 hour clock without specifying a.m. or p.m. when generating the filename, so there is potential for overwriting old files or appending new data to an old file:

Code Snippet
    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

Use a 24 hour clock instead, with capital HH:

Code Snippet
    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")

The other bugs I pointed out were "features" that I had introduced myself when converting from VB to C++ (I was a bit unfamiliar with the Using statement), so you can ignore these.
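To make the 12 hour versus 24 hour point concrete, here is a minimal console sketch. The module name TimestampDemo and the helper Stamp are hypothetical and not part of the program discussed in this thread; the sketch only formats two fixed times, 01:05:09 and 13:05:09 on the same day, with both patterns.

```vbnet
Imports System
Imports System.Globalization

Module TimestampDemo
    ' Hypothetical helper: builds the timestamp part of a file name.
    Function Stamp(ByVal moment As DateTime, ByVal pattern As String) As String
        Return moment.ToString(pattern, CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        Dim morning As New DateTime(2008, 1, 29, 1, 5, 9)
        Dim afternoon As New DateTime(2008, 1, 29, 13, 5, 9)

        ' 12 hour pattern (hh): both times give the same string.
        Console.WriteLine(Stamp(morning, "yyyyMMddhhmmss"))   ' 20080129010509
        Console.WriteLine(Stamp(afternoon, "yyyyMMddhhmmss")) ' 20080129010509

        ' 24 hour pattern (HH): the strings differ.
        Console.WriteLine(Stamp(morning, "yyyyMMddHHmmss"))   ' 20080129010509
        Console.WriteLine(Stamp(afternoon, "yyyyMMddHHmmss")) ' 20080129130509
    End Sub
End Module
```

With hh, a file written at 1 p.m. would get the same name as one written at 1 a.m., which is the collision described above.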

Edited by Tim Mathias

Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.

Tuesday, January 29, 2008 10:24 AM

Tim Mathias

345 Points

> Is it exactly necessary to mention e.Url.AbsoluteUri = ... because the url stays the same throughout the whole procedure?

It's essential, because the URL DOESN'T stay the same throughout the whole procedure: the webpage contains a link to a banner page that also calls the procedure after it loads. I've added a MessageBox to show these two URLs. It's this double message that causes the first table to be extracted in your script (i.e. the table we want to ignore).

I've also added an If statement that returns when the banner URL completes (it's a bit neater than the former If tests I wrote).

And I've added the Me.Close()

Code Snippet
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 Then
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub

Edited by Tim Mathias
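The URL test can be tried outside the form as a small console sketch. The module name UrlFilterDemo, the function IsTargetPage and the banner address are hypothetical and used only for illustration; the target URL is the one from this thread. The comparison is case-sensitive, like the = test in the handler.

```vbnet
Imports System

Module UrlFilterDemo
    ' The page whose completion we care about (URL taken from the thread).
    Const TargetPage As String = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX"

    ' Returns True only when the completed document is the target page.
    Function IsTargetPage(ByVal completed As Uri) As Boolean
        Return String.Equals(completed.AbsoluteUri, TargetPage, StringComparison.Ordinal)
    End Function

    Sub Main()
        ' Hypothetical banner address, standing in for the real banner page.
        Dim bannerPage As New Uri("http://www.example.com/banner.html")
        Dim quotesPage As New Uri(TargetPage)

        Console.WriteLine(IsTargetPage(bannerPage)) ' False
        Console.WriteLine(IsTargetPage(quotesPage)) ' True
    End Sub
End Module
```

Inside the real handler, the same check would be applied to e.Url before the counter is incremented, so that banner completions never advance it.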

Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.

Wednesday, January 30, 2008 2:42 PM

Tim Mathias

345 Points

I did originally limit the document_completed count to 10 tables to avoid an infinite repeat in case there was a problem parsing the DateTime from the webpage (marked in bold red in the original post). Without that limit, you'll have the cybercops after you for a suspected DoS attack.

Here's the ultimate bug free code (until you find the next one):

Code Snippet
    Dim previous_last_datetime As DateTime

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = seite) Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then
            previous_last_datetime = last_datetime
            Part5() ' Extract javascript and update last_datetime
            If previous_last_datetime > last_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub

Edited by Tim Mathias

Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.

Friday, February 1, 2008 7:04 PM

Tim Mathias

345 Points

All replies

Hi d.j.t,

Your question is related to automated testing technology. The website you mentioned is a German website.

Here is an introduction to one tool for Web Application Testing in .NET (WatiN). It allows you to emulate real users interacting with your web site by automating IE, and gives you an easy way to automate tests with Internet Explorer.
http://blogs.charteris.com/blogs/edwardw/archive/2007/07/16/watin-web-application-testing-in-netintroduction.aspx
http://watin.sourceforge.net/

Check the documents above for the main idea of web automation testing.

Basic features:
  • Automates all major HTML elements
  • Finds elements by multiple attributes

How to locate elements

Creating test scripts in most cases involves finding an HTML element and either causing it to fire an event, setting its value or asserting its expected value. In order to perform an action against an element you must first obtain a reference to it. This can be done in 3 different ways:
  • By the element's id (if it has one)
  • Regular expression that matches the element's id
  • Attribute class

Regards, Martin

Edited by Pan Zhang

Friday, July 19, 2013 3:25 AM

Friday, November 30, 2007 8:53 AM

Martin Xie - MSFT

24,335 Points

Hi d.j.t,

I think I have worked it out.

We can locate and access the elements of a webpage loaded in a WebBrowser control. In your case, you want to select an option from a ComboBox, check a CheckBox and click a Button.

1. The Darstellung ComboBox element and the Times & Sales option: <SELECT class=wp1-input id=ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_DD_Step name=ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step>
2. The Kapitalmaßnahmen einbeziehen CheckBox element:
3. The Aktualisieren Button element:

This code performs the steps above automatically:

Code Block
    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized
            ' Part 1: Use WebBrowser control to load web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            'Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                'Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                'Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    ' The element exposes a javascript click method, which we invoke on the current button element.
                    curElement.InvokeMember("click")
                End If
            Next
        End Sub

    End Class

Similar issue: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2456794&SiteID=1

  Best regards, Martin

Monday, December 3, 2007 4:00 AM

Martin Xie - MSFT

24,335 Points

d.j.t wrote: ... and copies the new table to a file.

To achieve the task, here are two suggestions:

1. Save the page body as an .htm file:

Code Block
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' After automatically clicking the button,
        ' append the following code to save the webpage as htm file
        Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
        w.Write(WebBrowser1.Document.Body.InnerHtml)
        w.Close()
    End Sub

2. Save the whole page as an .mht file. Check this thread for details: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2468541&SiteID=1
You need to Add Reference... -> COM tab -> find "Microsoft CDO For Windows 2000 Library" and "Microsoft ActiveX Data Objects 2.5 Library" and add them to your project.

Code Block
    Imports ADODB
    Imports CDO

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' After automatically clicking the button,
        ' append the following code to save the webpage as mht file
        SavePage(WebBrowser1.Url.ToString, "c:\table.mht")
    End Sub

    Private Sub SavePage(ByVal Url As String, ByVal FilePath As String)
        Dim iMessage As CDO.Message = New CDO.Message
        iMessage.CreateMHTMLBody(Url, CDO.CdoMHTMLFlags.cdoSuppressObjects, "", "")
        Dim adodbstream As ADODB.Stream = New ADODB.Stream
        adodbstream.Type = ADODB.StreamTypeEnum.adTypeText
        adodbstream.Charset = "US-ASCII"
        adodbstream.Open()
        iMessage.DataSource.SaveToObject(adodbstream, "_Stream")
        adodbstream.SaveToFile(FilePath, ADODB.SaveOptionsEnum.adSaveCreateOverWrite)
    End Sub
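As a side note on the first suggestion, the StreamWriter can also be wrapped in a Using block, so that the file is closed even if Write throws. The following is only a sketch under stated assumptions: the module name SaveHtmlDemo and the helper SaveHtml are hypothetical, the markup string is a placeholder for WebBrowser1.Document.Body.InnerHtml, and the file goes to the temp folder rather than C:\Table.htm so the sketch runs without a form.

```vbnet
Imports System
Imports System.IO

Module SaveHtmlDemo
    ' Writes the given markup to a file; Using disposes the writer even on errors.
    Sub SaveHtml(ByVal html As String, ByVal filePath As String)
        Using writer As New StreamWriter(filePath)
            writer.Write(html)
        End Using
    End Sub

    Sub Main()
        ' Placeholder markup standing in for WebBrowser1.Document.Body.InnerHtml.
        Dim html As String = "<table><tr><td>57,60</td></tr></table>"
        Dim target As String = Path.Combine(Path.GetTempPath(), "Table.htm")

        SaveHtml(html, target)

        ' Read the file back to confirm that the content round-trips.
        Console.WriteLine(File.ReadAllText(target) = html) ' True
    End Sub
End Module
```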

 

Monday, December 3, 2007 4:34 AM

Martin Xie - MSFT

24,335 Points

Hi Martin, your first reply is great! Thanks a lot!

1. I just have one problem with the first task: when executing, the selection of the combo and check boxes works perfectly fine, but the "aktualisieren" button is clicked endlessly. I'd like to stop that. (I used a WebBrowser element from the toolbox in Form1.)

2. With the extraction I unfortunately had problems too: " 'Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs)' has multiple definitions with identical signatures "

Naming the Private Sub "WebBrowser1_DocumentCompleted2" worked - I hope I can just do that... But anyway, this only helped with the first solution, which only creates an HTML file of the complete website (or at least parts of it). But I need something that I can easily import into a database, such as .txt (the cells separated by tabs and lines) or .xls. So I tried the second solution (not really knowing what the output will be in that case, maybe more or less the same), but after renaming the sub there was still the error: " Value of type 'System.Uri' cannot be converted to 'string' " But if the exported file will be more than the pure table data (as I expect), the problem doesn't really matter. If you have an idea how to deal with one of the problems, especially the first, I'd appreciate it if you could post it. My project has made enormous progress thanks to you!

Monday, December 3, 2007 2:59 PM

d.j.t

20 Points

Hi d.j.t,

1. " 'Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs)' has multiple definitions with identical signatures " naming the Private Sub "WebBrowser1_DocumentCompleted2" worked - i hope i can just do that...

-> You should place both parts of the code (the automation part and the save page part) into the single WebBrowser1_DocumentCompleted event handler. Don't name it WebBrowser1_DocumentCompleted2.

Code Block
    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized
            ' Part 1: Use WebBrowser control to load web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            'Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                'Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                'Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    ' The element exposes a javascript click method, which we invoke on the current button element.
                    curElement.InvokeMember("click")
                End If
            Next

            ' After automatically clicking the button,
            ' append the following code to save the webpage as htm file
            Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
            w.Write(WebBrowser1.Document.Body.InnerHtml)
            w.Close()
        End Sub

    End Class

2. So i tried the second solution (not really knowing what the output will be in that case, maybe more or less the same), but after renaming the sub still there was the error: " Value of type 'System.Uri' cannot be converted to 'string' "

-> Please change it to WebBrowser1.Url.ToString. I have modified my third post.

This solution saves the entire web page as an .mht file containing all text and images. That does not seem to be what you expect.

Tuesday, December 4, 2007 2:41 AM

Martin Xie - MSFT

24,335 Points

3. I just have one problem with the first task: when executing, the selection of the combo&checkboxes works perfectly fine, but the "aktualisieren" button is klicked endlessly. i'd like to stop that. (I used a webbrowser elemet from the toolbox in form1)

-> CAUSE: Clicking the button to retrieve data refreshes and reloads the current page, so the WebBrowser1_DocumentCompleted event fires again each time, and the handler clicks the button again.

Solution: You can place that code in the Button1_Click event.

Code Block
    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized
            ' Part 1: Use WebBrowser control to load web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            MessageBox.Show("Complete loading webpage") ' Optional code
        End Sub

        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
            'Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                'Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                'Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    ' The element exposes a javascript click method, which we invoke on the current button element.
                    curElement.InvokeMember("click")
                End If
            Next

            Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
            w.Write(WebBrowser1.Document.Body.InnerHtml)
            w.Close()
        End Sub

    End Class

4. But I need something that i can easily import to a database, such as .txt (the cellls seperated by tabs and lines) or .xls. But if the exported file will be more than the pure table data (as i expect) the problem doesn't really matter.

-> You need to retrieve the part of the HTML code (...) that contains the table data. Here are some references:
1) Using the HTML Parser to parse HTML code: http://www.developer.com/net/csharp/article.php/10918_2230091_2
2) See the similar issue; you can use Regular Expressions to extract part of the HTML code: .NET Development » Regular Expressions Forum
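As a rough illustration of the regular expression route, the sketch below pulls the first table out of a markup string and turns its rows into tab-separated lines, which is the kind of text file that imports easily into a database. Everything in it is an assumption made for illustration: the module name TableExtractDemo, the function TableToTsv and the sample markup are hypothetical, and the patterns only cope with simple markup without nested tables. For real pages an HTML parser is usually more robust.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module TableExtractDemo
    ' Converts the rows of the first <table> in the markup to tab-separated lines.
    Function TableToTsv(ByVal html As String) As String
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        Dim table As Match = Regex.Match(html, "<table[^>]*>(.*?)</table>", options)
        If Not table.Success Then Return String.Empty

        Dim lines As New List(Of String)
        For Each row As Match In Regex.Matches(table.Groups(1).Value, "<tr[^>]*>(.*?)</tr>", options)
            Dim cells As New List(Of String)
            For Each cell As Match In Regex.Matches(row.Groups(1).Value, "<t[dh][^>]*>(.*?)</t[dh]>", options)
                ' Strip any nested tags and trim whitespace inside the cell.
                cells.Add(Regex.Replace(cell.Groups(1).Value, "<[^>]+>", "").Trim())
            Next
            lines.Add(String.Join(vbTab, cells.ToArray()))
        Next
        Return String.Join(vbCrLf, lines.ToArray())
    End Function

    Sub Main()
        ' Sample markup invented for this sketch, not copied from the real page.
        Dim html As String = "<div><table><tr><th>Datum</th><th>Schluss</th></tr>" & _
                             "<tr><td>05.12.07 15:23</td><td>59,90</td></tr></table></div>"
        Console.WriteLine(TableToTsv(html))
    End Sub
End Module
```

In the form, the input string would come from WebBrowser1.Document.Body.InnerHtml, and the result could be written to a .txt file instead of the console.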

  I'm glad to hear that you have made enormous progress. Cheers! Best regards, Martin

Tuesday, December 4, 2007 3:47 AM

Martin Xie - MSFT

24,335 Points

Hi Martin, I tried to use the Button1_Click event but an error occurred: " Handles clause requires a WithEvents variable defined in the containing type or one of its base types " Nevertheless, when executing it, the same endless clicking of the refresh button happened... Thanks for your efforts! Dominik

Wednesday, December 5, 2007 9:56 AM

d.j.t

20 Points

I'm just working on the extraction.
- The first link is related to C# ... can I just change the language?
- The similar issue seems to be exactly what I want, but there is no complete code provided.
- The regular expressions thing - I apologize for this noob question - what is that?
Dominik

Wednesday, December 5, 2007 10:32 AM

d.j.t

20 Points

d.j.t wrote: Hi Martin i tried to use the button1click event but a error occured: " Handles clause requires a WithEvents variable defined in the containing type or one of its base types "

Please drag and drop a Button control named Button1 directly onto your Form.

Reference: WithEvents keyword http://msdn2.microsoft.com/en-us/library/aty3352y(VS.80).aspx
Specifies that one or more declared member variables refer to an instance of a class that can raise events.

e.g. Dim WithEvents Button1 As Button

Then at the top of the code view (e.g. Form1.vb), Button1 will be displayed in the Object Browser comboBox, and all events corresponding to Button1 will be displayed in the Event Browser comboBox.

Wednesday, December 5, 2007 10:38 AM
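The WithEvents requirement described above can be reproduced without a form. The sketch below is only an illustration: Clicker, Listener and WithEventsDemo are hypothetical names, and Clicker merely stands in for a Button. It shows that a Handles clause compiles only when it names a field declared WithEvents, which is what the designer generates when a Button is dropped onto the Form.

```vbnet
Imports System

' Hypothetical class that raises an event, standing in for a Button.
Class Clicker
    Public Event Clicked(ByVal sender As Object, ByVal e As EventArgs)

    Public Sub PerformClick()
        RaiseEvent Clicked(Me, EventArgs.Empty)
    End Sub
End Class

Class Listener
    ' Handles can only refer to a field declared WithEvents.
    ' Removing WithEvents here produces the compile error quoted in the thread.
    Private WithEvents Button1 As New Clicker()
    Public ClickCount As Integer = 0

    Private Sub Button1_Clicked(ByVal sender As Object, ByVal e As EventArgs) Handles Button1.Clicked
        ClickCount += 1
    End Sub

    Public Sub Simulate()
        Button1.PerformClick()
    End Sub
End Class

Module WithEventsDemo
    Sub Main()
        Dim demo As New Listener()
        demo.Simulate()
        Console.WriteLine(demo.ClickCount) ' 1
    End Sub
End Module
```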

Martin Xie - MSFT

24,335 Points

Well, I could have known it had something to do with a button on the form... sorry :-/ But now I'm really confused... because now I have to click the button to perform the tasks. And I'm not sure what you want to tell me with:

e.g. Dim WithEvents Button1 As Button

Then at the top of the code view (e.g. Form1.vb), the Button1 will display in the Object Browser comboBox, and all events corresponding to the Button1 will display in the Event Browser comboBox.

Do I need to insert this code even though I added a button? And is there a possibility to solve the repetition problem by adding something like the following (in plain English) to the code you first recommended? "and if value of the combobox is not equal to 0?"

Wednesday, December 5, 2007 11:01 AM

d.j.t

20 Points

d.j.t wrote: And I'am not sure what you want to tell me with:

e.g. Dim WithEvents Button1 As Button

Then at the top of the code view (e.g. Form1.vb), the Button1 will display in the Object Browser comboBox, and all events corresponding to the Button1 will display in the Event Browser comboBox. do I need to insert this code even though i added a button?

Because you said an error occurred: " Handles clause requires a WithEvents variable defined in the containing type or one of its base types ". The error has something to do with WithEvents, so that was only an extra reference. You can ignore it.

Coming back to the topic: please drag and drop a Button control named Button1 onto your Form. In this case, you have to click the button to perform the tasks. That is indeed a restriction.

OK! Please adopt this idea instead. Still use the WebBrowser1_DocumentCompleted event, but add a Boolean variable as a switch, which ensures the tasks are performed only once.

Code Block
    Public Class Form1

        Dim march As Boolean ' Set a switch

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            march = True ' Initialize the switch as True
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized
            ' Part 1: Use WebBrowser control to load web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            'Determine the switch state
            If march = True Then
                'Part 2: Automatically select specified option from ComboBox
                Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
                For Each curElement As HtmlElement In theElementCollection
                    Dim controlName As String = curElement.GetAttribute("name").ToString
                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                        curElement.SetAttribute("Value", 0)
                    End If
                Next

                Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
                For Each curElement As HtmlElement In theWElementCollection
                    Dim controlName As String = curElement.GetAttribute("name").ToString
                    'Part 3: Automatically check the CheckBox
                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                        curElement.SetAttribute("Checked", True)
                    'Part 4: Automatically click the button
                    ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                        curElement.InvokeMember("click")
                    End If
                Next

                Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
                w.Write(WebBrowser1.Document.Body.InnerHtml)
                w.Close()
                march = False ' Once the task is accomplished, change the switch to False.
            End If
        End Sub

    End Class

Wednesday, December 5, 2007 11:34 AM
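The effect of the Boolean switch can be checked in isolation with a small console sketch. The names here (RunOnceDemo, OnDocumentCompleted, pending, Clicks) are hypothetical; OnDocumentCompleted stands in for the DocumentCompleted handler, pending plays the role of march, and incrementing Clicks stands in for InvokeMember("click").

```vbnet
Imports System

Module RunOnceDemo
    Private pending As Boolean = True   ' plays the role of the "march" switch
    Public Clicks As Integer = 0        ' counts simulated button clicks

    ' Stand-in for the DocumentCompleted handler.
    Sub OnDocumentCompleted()
        If Not pending Then Return
        pending = False   ' clear the switch so later page loads are ignored
        Clicks += 1       ' stands in for curElement.InvokeMember("click")
    End Sub

    Sub Main()
        ' Simulate three page loads: the initial load plus two reloads.
        For i As Integer = 1 To 3
            OnDocumentCompleted()
        Next
        Console.WriteLine(Clicks) ' 1
    End Sub
End Module
```

Without the switch, each reload triggered by the click would click again, which is the endless clicking reported earlier in the thread.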

Martin Xie - MSFT

24,335 Points

Thank you! That's exactly what I was trying to do (but lack of experience prevented me from doing so)! First task accomplished! So there remains the second task of extracting the table... Even though I'm a bit embarrassed to ask after you helped me so much: did you see my questions concerning your links (regarding extraction) (Tuesday, 10:32 PM)?

Wednesday, December 5, 2007 3:47 PM

d.j.t

20 Points

d.j.t wrote: i'm just working on the extraction.
- the first link is related to c# ... can i just change the language?
- the similar issue seems to be excactly what i want but there is no complete code provided
- the regular expressions thing - i appologize for this noob question - what is that? dominik

Yes, I see the second task of extracting the table. Regular Expressions can be used to extract part of the HTML code.

You need to import the System.Text.RegularExpressions namespace. I suggest posting this task to the Regular Expressions forum for quicker and better responses.

.NET Development » Regular Expressions Forum

Please remember to point out the HTML page:

http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX

Also point out the table from which you want to extract data, as below:

Code Block

Historische Daten
Datum           Eröffnung  Hoch   Tief   Schluss  Volumen
05.12.07 15:23  57,60      59,90  57,60  59,90    3.753
04.12.07 18:29  57,90      58,10  57,27  57,50    4.730
03.12.07 18:57  58,50      58,75  57,39  57,85    10.219
30.11.07 14:43  57,95      58,75  57,95  58,46    12.249
29.11.07 14:52  58,45      58,75  58,00  58,00    1.532
28.11.07 14:17  57,70      58,23  57,58  58,23    1.540
27.11.07 16:08  58,60      58,92  57,30  57,60    7.683
26.11.07 14:09  58,30      59,00  58,30  58,90    5.321
23.11.07 19:10  57,15      57,74  57,15  57,50    8.880
22.11.07 19:48  57,60      57,60  56,51  56,51    9.393
21.11.07 19:23  58,30      58,80  56,90  57,00    7.971
20.11.07 15:12  58,05      58,80  57,07  58,80    5.601
19.11.07 15:23  58,70      59,35  57,60  57,95    6.562
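Once a row of this table is available as text, its values still use German number formatting (comma as decimal separator, dot as thousands separator). The following sketch shows one way to turn a row into typed values before loading it into a database. The module and function names (RowParseDemo, ParseStamp, ParsePrice, ParseVolume) are hypothetical, and it assumes the row tokens are separated by single spaces and that the de-DE culture is available on the machine.

```vbnet
Imports System
Imports System.Globalization

Module RowParseDemo
    Private ReadOnly German As CultureInfo = CultureInfo.GetCultureInfo("de-DE")

    ' Parses the date and time tokens, e.g. "05.12.07" and "15:23".
    Function ParseStamp(ByVal datePart As String, ByVal timePart As String) As DateTime
        Return DateTime.ParseExact(datePart & " " & timePart, "dd.MM.yy HH:mm", CultureInfo.InvariantCulture)
    End Function

    ' Parses a price such as "57,60" (comma is the decimal separator).
    Function ParsePrice(ByVal text As String) As Decimal
        Return Decimal.Parse(text, NumberStyles.Number, German)
    End Function

    ' Parses a volume such as "3.753" (dot is the thousands separator).
    Function ParseVolume(ByVal text As String) As Integer
        Return Integer.Parse(text, NumberStyles.AllowThousands, German)
    End Function

    Sub Main()
        Dim row As String = "05.12.07 15:23 57,60 59,90 57,60 59,90 3.753"
        Dim parts() As String = row.Split(" "c)

        Dim stamp As DateTime = ParseStamp(parts(0), parts(1))
        Dim openPrice As Decimal = ParsePrice(parts(2))
        Dim volume As Integer = ParseVolume(parts(6))

        Console.WriteLine(stamp.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture))
        Console.WriteLine(openPrice.ToString(CultureInfo.InvariantCulture))
        Console.WriteLine(volume)
    End Sub
End Module
```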


By the way, convert C# code to VB.NET code by means of this Code Translator tool.

Thursday, December 6, 2007 3:16 AM

Martin Xie - MSFT

24,335 Points

Hi Martin! Well, there is one last question (even though others might follow :-) that fits in this topic: How do I click the "weiter" button at the bottom of the table? I tried to do it the same way as clicking "refresh":

    Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
    For Each curElement As HtmlElement In theWElementCollection
        Dim controlName As String = curElement.GetAttribute("name").ToString
        'Part 4: Automatically click the button
        If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
            curElement.InvokeMember("click")
        End If
    Next

I tried to find the TagName and the attribute for the "weiter" link, but it didn't work with what I found: "a" instead of "input" and "id" instead of "name". Once more I hope you can provide help. Thanks, Dominik

Thursday, December 6, 2007 12:13 PM

d.j.t


The following is the complete code. Please check Part 5: Automatically click Continue link ("weiter" translates to "Continue").

Code Block

Public Class Form1
    Dim march As Boolean ' Set a switch

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        march = True ' Initialize the switch as True
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' Determine the switch state
        If march = True Then
            ' Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                ' Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                    ' Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    curElement.InvokeMember("click")
                End If
            Next

            'Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
            'w.Write(WebBrowser1.Document.Body.InnerHtml)
            'w.Close()
            march = False ' Once the tasks are accomplished, change the switch to False.
        Else
            ' If march = False, the tasks above are not needed; go straight on to click the "Continue" link.
            ' Part 5: Automatically click Continue link
            Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
            For Each curElement As HtmlElement In hrefElementCollection
                Dim controlName As String = curElement.GetAttribute("id").ToString
                If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                    curElement.InvokeMember("Click")
                End If
            Next
        End If
    End Sub
End Class

Friday, December 7, 2007 3:22 AM

Martin Xie - MSFT


Hi Martin, thanks for the reference to the other forum, it was quite useful: somebody there could provide assistance!

I have to extend the question above:

This program is meant to be launched each day to copy the data. But due to holidays that won't always be possible. And sometimes all the data doesn't fit onto one page (the tables on the site concerned are limited to 100 rows). That's why I am thinking about a loop in the final part: after selecting, refreshing and copying, I'd like to have the "weiter" (next page) link clicked and the copying done again and again until a certain past date appears in the table. Like this:

1. do selections and refresh
2. extract
3. click "weiter" (next page) (so far my question above), but only if the last date in the table is not more than x days ago (click the link if: last_date_in_table > todays_date - x)
4. then go back to step 2

I'd be fine if x could be a variable, selected in a form when starting the program. But that should be rather easy then.

Thanks for your commitment
Dominik

Edit: I just noticed your answer to my last question. Many thanks!

Friday, December 7, 2007 1:33 PM

d.j.t
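
The stop condition from step 3 above can be isolated in a small function, which makes it easy to try out without a browser. This is only a sketch under the assumptions stated in the post (keep paging while last_date_in_table > todays_date - x); the names ShouldClickNext, referenceDate and daysBack are hypothetical.

```vbnet
Imports System

Module NextPageCheck
    ' Hypothetical helper: True while the oldest row on the current page
    ' is still newer than the cut-off, i.e. while "weiter" should be clicked.
    Function ShouldClickNext(ByVal lastDateInTable As DateTime, _
                             ByVal referenceDate As DateTime, _
                             ByVal daysBack As Integer) As Boolean
        Return lastDateInTable > referenceDate.AddDays(-daysBack)
    End Function

    Sub Main()
        Dim referenceDate As DateTime = New DateTime(2007, 12, 7)

        ' Oldest row is from 5 December, cut-off is 3 days back: prints True.
        Console.WriteLine(ShouldClickNext(New DateTime(2007, 12, 5, 15, 23, 0), referenceDate, 3))

        ' Oldest row is from 19 November: prints False, so paging stops.
        Console.WriteLine(ShouldClickNext(New DateTime(2007, 11, 19, 15, 23, 0), referenceDate, 3))
    End Sub
End Module
```

In the real program, daysBack would come from the start-up form and referenceDate would be Today.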


Hi, with that code (thanks for it) the repetition at the end is happening again. I introduced a second switch and changed the final part to avoid this:

        Else
            If marchb = True Then
                'Part 5: Automatically click Continue link
                Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
                For Each curElement As HtmlElement In hrefElementCollection
                    Dim controlName As String = curElement.GetAttribute("id").ToString
                    If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                        curElement.InvokeMember("Click")
                    End If
                    'insert extraction once again
                    marchb = False ' missing: if date as specified
                Next
            End If
        End If
    End Sub
End Class

The task with the date remains. I really appreciate your advice!

Friday, December 7, 2007 2:08 PM

d.j.t


Hi Martin,

- At the regex forum I was given a lot of help, but one problem remains: I inserted the extraction where I had planned it, but it seems it happens too fast: the extracted table is the one displayed before refreshing. I hoped that pausing for a few seconds, or another switch set once the new table is completely loaded, would do the trick, but my attempts have not been successful yet.

- And another little thing: up to now the extracted table is saved to a file with a fixed name. As this program will run often, I'd like to have a changing date component and (for several pages a day) a counter in the file name.

Now I am puzzled once more: I finally tried the exporting, but it exported the first table, the one that is displayed before the selection from the combo boxes is done (I need the table that is displayed after the combo box selection). What's wrong? Please have a look at my complete code. Thank you:

Imports System.IO

Imports System.Text.RegularExpressions

Public Class Form1

Dim lastDate As DateTime

Dim marchb As Boolean

Dim march As Boolean ' Set a switch

Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

march = True

' Initialize the switch as True

marchb = True

WebBrowser1.Dock = DockStyle.Fill

Me.WindowState = FormWindowState.Maximized

' Part 1: Use WebBrowser control to load web page

WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")

End Sub

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted

' Determine the switch state

If march = True Then

'Part 2: Automatically select specified option from ComboBox

Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

For Each curElement As HtmlElement In theElementCollection

Dim controlName As String = curElement.GetAttribute("name").ToString

If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then

curElement.SetAttribute("Value", 0)

End If

Next

'Part 2,5: Automatically select specified option from ComboBox

Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

For Each curElement As HtmlElement In the2ElementCollection

Dim controlName As String = curElement.GetAttribute("name").ToString

If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then

curElement.SetAttribute("Value", 100)

End If

Next

Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")

For Each curElement As HtmlElement In theWElementCollection

Dim controlName As String = curElement.GetAttribute("name").ToString

'Part 3: Automatically check the CheckBox

If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then

curElement.SetAttribute("Checked", True)

'Part 4: Automatically click the button

ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then

curElement.InvokeMember("click")

End If

Next

' Part 5: export (JavaScript)

Dim rows As New System.Collections.ObjectModel.Collection(Of String())()

Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)

rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))

Next

' export to txt

march = False

' If accomplish the task, change the switch to False.

lastDate = Nothing

Dim lastDateStr As String = Nothing

Dim separator As String = vbTab

Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export.txt")

For Each row As String() In rows

sw.WriteLine(String.Join(separator, row))

lastDateStr = row(0)

Next

End Using

If lastDateStr IsNot Nothing Then

lastDate = DateTime.Parse(lastDateStr)

End If

Else

' If march = False, the tasks above are not needed; directly click the Continue link.

If marchb = True And lastDate = Today.AddDays(1) Then ' something like that - I don't think that already works

'Part 6: Automatically click Continue link

Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

For Each curElement As HtmlElement In hrefElementCollection

Dim controlName As String = curElement.GetAttribute("id").ToString

If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then

curElement.InvokeMember("Click") ' extract again... yet to be inserted

End If

marchb = False

Next

End If

End If

End Sub

End Class

Wednesday, December 12, 2007 9:17 AM

d.j.t


Hi d.j.t, welcome back!

I'm glad to hear that you got much help from the Regular Expressions forum.

"but it seems it happens too fast: the extracted table is the one displayed before refreshing."

->
'Delay 2 seconds
System.Threading.Thread.Sleep(2000)
'Call sub to extract
ExportTableData()

"And another little thing: up to now the extracted table is saved to a file with a fixed name. As this program will run often, I'd like to have a changing date component and (for several pages a day) a counter in the file name."

->
'Add current DateTime to the file name to identify it (MM = month, HH = 24-hour clock)
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")

Thursday, December 13, 2007 2:58 AM

Martin Xie - MSFT
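
Since the format string for the file name is easy to get wrong, here is a small sketch of a file name builder with both a timestamp and a page counter, as asked for above. BuildFileName and the "export_..._001.txt" layout are assumptions chosen for illustration; only the custom format specifiers are standard .NET behaviour.

```vbnet
Imports System
Imports System.Globalization

Module FileNameSketch
    ' Hypothetical helper: builds "export_<timestamp>_<page>.txt".
    Function BuildFileName(ByVal stamp As DateTime, ByVal page As Integer) As String
        ' MM = month, mm = minutes; HH = 24-hour clock, hh = 12-hour clock.
        Dim stampText As String = stamp.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture)
        Return "export_" & stampText & "_" & page.ToString("000") & ".txt"
    End Function

    Sub Main()
        Dim stamp As DateTime = New DateTime(2007, 12, 13, 15, 4, 9)

        ' Prints export_20071213150409_001.txt
        Console.WriteLine(BuildFileName(stamp, 1))

        ' Same instant with lower-case specifiers: the month slot holds the
        ' minutes and 15:04 is written as 03. Prints 20070413030409
        Console.WriteLine(stamp.ToString("yyyymmddhhmmss", CultureInfo.InvariantCulture))
    End Sub
End Module
```

With the upper-case specifiers the names also sort chronologically, which helps when several pages are exported per day.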


This is the complete code. The modified parts (originally marked in bold) are the two-second delays and the calls to ExportTableData().

Code Block

Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1
    Dim lastDate As DateTime
    Dim marchb As Boolean
    Dim march As Boolean ' Set a switch

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        march = True ' Initialize the switch as True
        marchb = True
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' Determine the switch state
        If march = True Then
            ' Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            ' Part 2.5: Automatically select specified option from ComboBox
            Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In the2ElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then
                    curElement.SetAttribute("Value", 100)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                ' Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                    ' Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    curElement.InvokeMember("click")
                End If
            Next

            march = False ' Once the tasks are accomplished, change the switch to False.
            ' Delay 2 seconds
            System.Threading.Thread.Sleep(2000)
            ' Call sub to extract
            ExportTableData()
        Else
            ' If march = False, the tasks above are not needed; directly click the Continue link.
            If marchb = True And lastDate = Today.AddDays(1) Then ' something like that - dont think that already works
                ' Part 6: Automatically click Continue link
                Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
                For Each curElement As HtmlElement In hrefElementCollection
                    Dim controlName As String = curElement.GetAttribute("id").ToString
                    If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                        curElement.InvokeMember("Click")
                        ' Delay 2 seconds
                        System.Threading.Thread.Sleep(2000)
                        ' Call sub to extract again
                        ExportTableData()
                    End If
                    marchb = False
                Next
            End If
        End If
    End Sub

    ' To be continued...

  Thursday, December 13, 2007 3:10 AM

Martin Xie - MSFT


Code Block

    ' Continued
    ' I put the extraction code in a custom method so that it can be called conveniently.
    Public Sub ExportTableData()
        ' Part 5: export (JavaScript)
        Dim rows As New System.Collections.ObjectModel.Collection(Of String())()
        Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
        For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
            rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
        Next

        ' export to txt
        lastDate = Nothing
        Dim lastDateStr As String = Nothing
        Dim separator As String = vbTab
        ' Add current DateTime to the file name to identify it (MM = month, HH = 24-hour clock)
        Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
        Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")
            For Each row As String() In rows
                sw.WriteLine(String.Join(separator, row))
                lastDateStr = row(0)
            Next
        End Using
        If lastDateStr IsNot Nothing Then
            lastDate = DateTime.Parse(lastDateStr)
        End If
    End Sub
End Class

   

Thursday, December 13, 2007 3:13 AM

Martin Xie - MSFT
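
To see what the pattern in ExportTableData() actually captures, the following sketch runs it against a made-up stand-in for WebBrowser1.DocumentText. The two sample script lines are an assumption about how the page builds its table (lines of the form myl+='...'); the real page source may differ, so treat this as an illustration of the regex only.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExtractSketch
    Sub Main()
        ' Made-up sample. VB strings have no backslash escapes, so \t and
        ' \r\n below are literal characters, as they would appear inside
        ' the JavaScript source of the page.
        Dim html As String = "myl+='05.12. 15:23:00\t57,60\t3.753\r\n';" & _
                             "myl+='04.12. 18:29:00\t57,90\t4.730\r\n';"

        ' Same pattern as in the thread: everything after myl+=' up to,
        ' but not including, the literal \r\n' that ends the script line.
        Dim pattern As String = "(?<=" & Regex.Escape("myl+='") & ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

        For Each m As Match In Regex.Matches(html, pattern)
            ' Split on the literal two-character sequence \t.
            Dim fields As String() = m.Value.Split(New String() {"\t"}, StringSplitOptions.None)
            Console.WriteLine(String.Join(" | ", fields))
        Next
    End Sub
End Module
```

Each match becomes one row, and the first field of the last row is what the thread's code stores in lastDateStr.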

Thanks for all those answers! Just great! I hope that with this I can finally finish my task. Loads of thanks!

Thursday, December 13, 2007 10:58 AM

d.j.t

Hi Martin, finally I have complete working code that does exactly what I want. Big thanks to you! I still have some questions, but they are mere "cosmetics".

- With that code the first table is copied twice. I don't really understand why...
- Can it easily be arranged that the user doesn't notice anything of the script's execution once it has been started? I mean no window, no sounds...
- I'd like the program to be used not only for one stock, but for several (up to 100). So I could just change the address in the first sub and create an executable program for each stock, then write a few lines that make all those programs run. I think this should even be possible at the same time? Of course it would be more elegant if I didn't need to create so many single programs. Is there a conveniently easy way to do this in the script?

Thanks!
Dominik

PS: Script in the next post... I can't post it in color (don't ask me why, the forum always refuses to accept it with an "unknown error").

Friday, December 14, 2007 1:23 PM

d.j.t
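
For the several-stocks question, one option is to keep a list of symbols in a single program and build the address for each one, rather than compiling one executable per stock. The sketch below only shows the address building; the module and function names are hypothetical, and the three symbols are simply the ones that appear in this thread.

```vbnet
Imports System

Module SymbolUrlSketch
    ' Base address as used in the thread; only the symbol at the end changes.
    Const BaseUrl As String = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol="

    ' Hypothetical helper: returns the quote-history address for one symbol.
    Function BuildUrl(ByVal symbol As String) As String
        Return BaseUrl & Uri.EscapeDataString(symbol)
    End Function

    Sub Main()
        Dim symbols As String() = {"FLUK.NWX", "EAD.ETR", "SAP.ETR"}
        For Each symbol As String In symbols
            Console.WriteLine(BuildUrl(symbol))
        Next
    End Sub
End Module
```

In the form, the next address would be passed to WebBrowser1.Navigate only after the previous stock has been fully exported, so that the switches and lastDate can be reset between stocks.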


Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1
    Dim lastDate As DateTime
    Dim marchb As Boolean
    Dim marchc As Boolean
    Dim march As Boolean ' Set a switch

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        march = True ' Initialize the switch as True
        marchc = True

        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized

        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=EAD.ETR")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' Determine the switch state
        If march = True Then
            'Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            'Part 2.5: Automatically select specified option from ComboBox
            Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In the2ElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then
                    curElement.SetAttribute("Value", 100)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                'Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                    'Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    curElement.InvokeMember("click")
                    march = False ' Once the task is accomplished, change the switch to False.
                End If
            Next

        Else
            If marchc = True And march = False Then ' If march = False, the tasks above are not needed.
                'part 5 export
                extract()
                marchc = False
            End If
        End If

        If marchc = False And lastDate > Today.AddDays(-2) Then ' im not sure if that works
            'Part 6 Automatically click Continue link
            Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
            For Each curElement As HtmlElement In hrefElementCollection
                Dim controlName As String = curElement.GetAttribute("id").ToString
                If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                    curElement.InvokeMember("Click")
                End If
            Next
            extract()
            'ElseIf lastDate > "01.01.0001" And lastDate < Today.AddDays(-2) Then : Close() 'just good to know...
        End If
    End Sub

    Public Sub extract()
        Dim rows As New System.Collections.ObjectModel.Collection(Of String())()
        Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
        For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
            rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
        Next

        ' export to txt
        lastDate = Nothing
        Dim lastDateStr As String = "0"
        Dim separator As String = vbTab
        Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")
        Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")
            For Each row As String() In rows
                sw.WriteLine(String.Join(separator, row))
                lastDateStr = row(0)
            Next
        End Using
        If lastDateStr IsNot "0" Then
            lastDate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
            System.Threading.Thread.Sleep(1000)
        End If
    End Sub

End Class

Friday, December 14, 2007 1:30 PM

d.j.t

"im not sure if that works"

Try this:

Code Snippet

Public Class Form1
    Dim document_completed As Integer
    Dim last_datetime As DateTime
    Dim earliest_datetime As DateTime

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        Part1() ' Use WebBrowser control to load web page
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Click Continue Button
            End If
        End If
    End Sub

    Private Sub Part1()
        ' Part 1: Use WebBrowser control to load web page
        document_completed = 0
        last_datetime = DateTime.Now
        earliest_datetime = last_datetime.AddDays(-2)
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub Part2()
        ' Part 2: Automatically select specified option from ComboBox
    End Sub

    Private Sub Part3()
        ' Part 3: Automatically check the CheckBox
    End Sub

    Private Sub Part4()
        ' Part 4: Automatically click the Button
    End Sub

    Private Sub Part5()
        ' Part 5: Extract javascript and update last_datetime
    End Sub

    Private Sub Part6()
        ' Part 6: Click Continue Button
    End Sub
End Class

Edited by Tim Mathias

Wednesday, October 14, 2009 6:25 PM Reformatted code snippet.

Friday, January 25, 2008 6:06 AM

Tim Mathias

Not forgetting Part 7 from this thread: http://forums.microsoft.com/msdn/showpost.aspx?postid=2514450&siteid=1&sb=0&d=1&at=7&ft=11&tf=0&pageid=2

Code Snippet

If last_datetime > earliest_datetime Then
    Part6() ' Click Continue Button
Else
    Me.Close() ' Part 7: Close programme
End If

Edited by Tim Mathias

Wednesday, October 14, 2009 6:10 PM Reformatted code snippet.

Friday, January 25, 2008 6:22 AM

Tim Mathias

Hi Dominik, I found a couple of bugs in Part 5 when I tried it out in C++ (I'm a C++ man, not a VB one). The important changes are: a 24-hour clock, closing the output file immediately after writing to it, and parsing a 15-character substring for the last datetime. (I've also used GetElementById to get straight to the point.)

With the original version, ParseExact threw an exception every time, leaving the output file open and empty. Maybe this is what is causing your stability issues with VB.

Code Snippet

void Part1 ()
{
    Trace::WriteLine ("Part 1");

    // Part 1: Use WebBrowser control to load web page
    document_completed = 0;
    last_datetime = DateTime::Now;
    earliest_datetime = last_datetime.AddDays (-2.0);
    webBrowser1->DocumentCompleted += gcnew WebBrowserDocumentCompletedEventHandler (this, &Form1::DocumentCompleted);
    webBrowser1->Navigate ("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX");
}

void Part2 ()
{
    Trace::WriteLine ("Part 2");

    // Part 2: Automatically select specified option from ComboBox
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step");
    el->SetAttribute ("value", "0");
}

void Part3 ()
{
    Trace::WriteLine ("Part 3");

    // Part 3: Automatically check the CheckBox
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures");
    el->SetAttribute ("checked", "true");
}

void Part4 ()
{
    Trace::WriteLine ("Part 4");

    // Part 4: Automatically click the button
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1");
    el->InvokeMember ("click");
}

void Part5 ()
{
    Trace::WriteLine ("Part 5");

    // Part 5: Extract javascript and update last_datetime
    try
    {
        ArrayList ^rows = gcnew ArrayList ();
        Regex ^pattern = gcnew Regex ("(?<=myl\\+=\\')([^\\\\]+(?:\\\\t))+([^\\\\]+(?=\\\\r\\\\n'))");
        Trace::WriteLine ("Part 5: pattern = " + pattern);
        MatchCollection ^matches = pattern->Matches (webBrowser1->DocumentText);
        Trace::WriteLine ("Part 5: matches->Count = " + matches->Count);
        array <String^> ^tab = { gcnew String ("\\t") };
        for (int i = 0; i < matches->Count; i++)
        {
            Trace::WriteLine (matches [i]->Value);
            rows->Add (String::Join ("\t", matches [i]->Value->Split (tab, StringSplitOptions::None)));
            Trace::WriteLine (rows [i]);
        }
        String ^current_datetime = DateTime::Now.ToString ("yyyyMMddHHmmss"); // 24 hour clock
        StreamWriter ^file = gcnew StreamWriter ("BrowserAutomation" + current_datetime + ".txt");
        for (int i = 0; i < rows->Count; i++)
        {
            file->WriteLine (rows [i]);
        }
        file->Close ();

        String ^str_last_datetime = (String ^) rows [rows->Count - 1];
        Trace::WriteLine ("str_last_datetime = " + str_last_datetime);
        last_datetime = DateTime::ParseExact (str_last_datetime->Substring (0, 15), "dd.MM. HH:mm:ss", System::Globalization::CultureInfo::CreateSpecificCulture ("de-de"));
        Trace::WriteLine ("last_datetime = " + last_datetime);
    }
    catch (Exception ^e)
    {
        Trace::WriteLine ("Part 5: " + e->Message);
    }
}

void Part6 ()
{
    Trace::WriteLine ("Part 6");

    // Part 6: Click Continue Button
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More");
    el->InvokeMember ("click");
}

Edited by Tim Mathias

Wednesday, October 14, 2009 6:20 PM Reformatted code snippet.

Friday, January 25, 2008 10:37 PM

Tim Mathias
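
For readers who only use VB.NET, the date handling from Part 5 above could look roughly like this. It is a sketch rather than a line-by-line translation: TryParseRowDate is a hypothetical helper, the sample row is made up, and TryParseExact is used instead of ParseExact so that a row that does not fit yields False instead of an exception.

```vbnet
Imports System
Imports System.Globalization
Imports Microsoft.VisualBasic

Module LastDateSketch
    ' Hypothetical helper: parses the first 15 characters of an exported
    ' row ("dd.MM. HH:mm:ss"). Returns False when the text does not fit.
    Function TryParseRowDate(ByVal row As String, ByRef result As DateTime) As Boolean
        If row Is Nothing OrElse row.Length < 15 Then Return False
        Dim german As CultureInfo = CultureInfo.CreateSpecificCulture("de-de")
        Return DateTime.TryParseExact(row.Substring(0, 15), "dd.MM. HH:mm:ss", _
                                      german, DateTimeStyles.None, result)
    End Function

    Sub Main()
        Dim parsed As DateTime
        ' Made-up row in the tab-separated layout written to the export file.
        Dim row As String = "05.12. 15:23:00" & vbTab & "57,60" & vbTab & "3.753"

        If TryParseRowDate(row, parsed) Then
            ' The pattern contains no year, so .NET fills in the current year.
            Console.WriteLine(parsed.ToString("dd.MM. HH:mm:ss", CultureInfo.InvariantCulture))
        Else
            Console.WriteLine("could not parse")
        End If
    End Sub
End Module
```

Because the year is filled in from the system clock, rows from late December that are parsed in early January would land in the wrong year; a cut-off comparison around New Year may need an extra correction.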


Hi, thanks for your posts, but as this is my first script and my programming experience is therefore near zero, I don't know how I would translate your script to VB.NET. Or do you propose switching to C++? Well, I've only used VB.NET up to now.

Nevertheless I made some changes to my code (namely, I put AddDays(-1) everywhere I had different numbers before) and now it seems to work. This program is supposed to run on an old Win2000 SP4 computer that is not used for anything else, so nobody can interfere. But while everything was working fine on the (more or less new) Win XP computer on which I wrote the whole thing, it is not working that well on the old Win2000 SP4 computer. What happens there (while it works fine most of the time) is that SOMETIMES the first table is copied, the one that was displayed when first browsing to the page, before doing the selections and refreshing. So to me it seems as if the script doesn't wait for the DocumentCompleted event any more. But only sometimes! Sometimes the correct table is copied, sometimes not. I don't understand this! (Actually, I never fully understood the DocumentCompleted event thing.) The only explanation I have is that the old computer is too slow... I'm frustrated! Does anyone have an idea why this could be? I'll post the whole code once again...

Thanks
Dominik

Monday, January 28, 2008 2:22 PM

d.j.t

Imports System.IO

Imports System.Text.RegularExpressions

Public Class Form1

Dim lastDate As DateTime

Dim marchc As Boolean

Dim march As Boolean ' set 2 switches

Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

Me.Visible = False

march = True ' Initialize the switches as True

marchc = True

WebBrowser1.Dock = DockStyle.Fill

Me.WindowState = FormWindowState.Maximized

' Part 1: Use WebBrowser control to load web page

WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=SAP.ETR")

End Sub

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted

' Determine the switch state

'Me.Visible = False ' doesn't matter

If march = True Then

'Part 2: Automatically select specified option from ComboBox

Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

For Each curElement As HtmlElement In theElementCollection

Dim controlName As String = curElement.GetAttribute("name").ToString

If controlName.contains("DD_Step") Then

curElement.SetAttribute("Value", 0)

End If

Next

'Part 2,5: Automatically select specified option from ComboBox

Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

For Each curElement As HtmlElement In the2ElementCollection

Dim controlName As String = curElement.GetAttribute("name").ToString

If controlName.contains("DD_Lines") Then

curElement.SetAttribute("Value", 100)

End If

Next

Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")

For Each curElement As HtmlElement In theWElementCollection

Dim controlName As String = curElement.GetAttribute("name").ToString

'Part 3: Automatically check the CheckBox

If controlName.contains("CBx_CapitalMeasures") Then

curElement.SetAttribute("Checked", True)

'Part 4: Automatically click the button

ElseIf controlName.contains("IBtn_Refresh1") Then

curElement.InvokeMember("click")

march = False' If accomplish the task, change the switch1 to False.

End If

Next

Else

If marchc = True And march = False Then ' If march = False, don't need to perform above tasks, directly click Continue link.

'part 5 export

extract()

marchc = False

End If

End If

If marchc = False And lastDate > Today.AddDays(-1) Then ' im not sure if that works

'Part 6 Automatically click Continue link

Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

For Each curElement As HtmlElement In hrefElementCollection

Dim controlName As String = curElement.GetAttribute("id").ToString

If controlName.Contains("LBtn_More") Then

curElement.InvokeMember("Click")

End If

Next

extract()

'Part 7: close program

ElseIf lastDate > "01.01.0001" And lastDate < Today.AddDays(-1) Then

Me.Close()

End If

End Sub

'Sub to extract

Public Sub extract()

Dim rows As New System.Collections.ObjectModel.Collection(Of String())()

Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)

rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))

Next

' export to txt

lastDate = Nothing

Dim lastDateStr As String = "0"

Dim separator As String = vbTab

Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

Using sw As StreamWriter = File.CreateText("C:\abc\def\etr" & currentDataTime & ".txt")

For Each row As String() In rows

sw.WriteLine(String.Join(separator, row))

lastDateStr = row(0)

Next

End Using

If lastDateStr IsNot "0" Then

lastDate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))

System.Threading.Thread.Sleep(2000)

End If

End Sub

End Class

 

Monday, January 28, 2008 2:23 PM

d.j.t

20 Points

Dominik: "What happens there (while it works fine most of the time) is that SOMETIMES the first table is copied, the one that was displayed when first browsing to the page, before doing the selections and refreshing. So to me it seems as if the script doesn't wait for the DocumentCompleted event any more. But only sometimes! Sometimes the correct table is copied, sometimes not. I don't understand this! (Actually I never fully understood the DocumentCompleted event thing.) The only way I can explain it is that the old computer is too slow... I'm frustrated!"

Hi Dominik,

In Part 6 you are extracting the JavaScript immediately after automatically clicking the More button, without waiting for the next web page to load with new data:

Code Snippet

    'Part 6 Automatically click Continue link
    Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
    For Each curElement As HtmlElement In hrefElementCollection
        Dim controlName As String = curElement.GetAttribute("id").ToString
        If controlName.Contains("LBtn_More") Then
            curElement.InvokeMember("Click")
        End If
    Next
    extract()

The code in my first post on this thread fixes that problem. The DocumentCompleted event fires when a new web page has loaded. After clicking the button in Part 4 we have to wait for the next DocumentCompleted, which tells us that the next web page has loaded with new data. The same applies to clicking the More button in Part 6 (see: http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):

Code Snippet

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Click Continue Button
            End If
        End If
    End Sub

But the If statements need to be refined a bit because DocumentCompleted fires twice per page (once for the page banner and once for the default page containing the JavaScript data that we want):

Code Snippet

    If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
    .
    .
    .
    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then

The second problem is that you are using a 12-hour clock without specifying a.m. or p.m. when generating the filename, so there is potential for overwriting old files or appending new data to an old file:

Code Snippet

    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

Use a 24-hour clock instead, with capital HH:

Code Snippet

    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")

The other bugs I pointed out were "features" that I had introduced myself when converting from VB to C++ (I was a bit unfamiliar with the Using statement), so you can ignore these.

Edited by Tim Mathias

Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.

Tuesday, January 29, 2008 10:24 AM

Tim Mathias

345 Points
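The filename collision described in the post above can be reproduced without the browser. This is only a minimal console sketch: the module name and the two timestamps are arbitrary examples chosen for illustration, twelve hours apart on the same day.

```vbnet
Imports System
Imports System.Globalization

Module FileNameClockDemo
    Sub Main()
        ' Two example timestamps (assumed values), exactly twelve hours apart.
        Dim morning As New DateTime(2008, 1, 28, 2, 23, 0)
        Dim afternoon As New DateTime(2008, 1, 28, 14, 23, 0)

        ' "hh" is the 12-hour clock: 02:23 and 14:23 both format as "02".
        Dim a12 As String = morning.ToString("yyyyMMddhhmmss", CultureInfo.InvariantCulture)
        Dim b12 As String = afternoon.ToString("yyyyMMddhhmmss", CultureInfo.InvariantCulture)
        Console.WriteLine("hh: " & a12 & " / " & b12 & "  identical: " & (a12 = b12).ToString())

        ' "HH" is the 24-hour clock: the two names differ.
        Dim a24 As String = morning.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture)
        Dim b24 As String = afternoon.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture)
        Console.WriteLine("HH: " & a24 & " / " & b24 & "  identical: " & (a24 = b24).ToString())
    End Sub
End Module
```

With "hh" both timestamps produce 20080128022300, so the second file would reuse the first file's name; with "HH" the afternoon run produces 20080128142300 instead.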

Hi Tim, thanks for your comprehensive explanations! I think with the structure you are advising it should work a lot better than what I had before. One thing I still don't understand is why my script extracts not only the "old table" but also the new one... well, but that doesn't matter.

First I wondered whether this would allow no more than 10 tables:

    ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables

But I see this part needs to be changed to what you wrote, so this restriction drops out:

    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then

Is it really necessary to mention e.Url.AbsoluteUri = ..., given that the URL stays the same throughout the whole procedure?

Well, as I am doing this while studying, I can't implement all your advice right now, but I'll do so soon and report my progress! Thanks a lot! Dominik

Wednesday, January 30, 2008 10:23 AM

d.j.t

20 Points

Hi, I just tried it, works fine! Just the Me.Close part is missing, but there is no time left now; I will continue next Friday. Thanks a lot!!!!!! Dominik

Wednesday, January 30, 2008 11:10 AM

d.j.t

20 Points

> Is it exactly necessary to mention e.Url.AbsoluteUri = ... because the url stays the same througout the whole procedure?

 

It's essential because the URL DOESN'T stay the same throughout the whole procedure: the web page contains a link to a banner page that also calls the procedure after it loads. I've added a MessageBox to show these two URLs. It's this double message that causes the first table to be extracted in your script (i.e. the table we want to ignore).

I've also added an If statement that returns when the banner URL completes (it's a bit neater than the former If tests I wrote).

And I've added the Me.Close().

Code Snippet

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 Then
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub

Edited by Tim Mathias

Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.

Wednesday, January 30, 2008 2:42 PM

Tim Mathias

345 Points
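The effect of the URL check in the handler above can be tried without a browser. What follows is only a sketch under stated assumptions: both URLs are placeholders (the real banner address is not given in this thread), the order in which the events arrive is assumed, and OnDocumentCompleted is a hypothetical stand-in for the real event handler that returns a description instead of calling Part2() to Part6().

```vbnet
Imports System

Module DocumentCompletedFilterDemo
    ' Placeholder URLs for illustration only (assumptions, not from the thread).
    Private ReadOnly TargetUrl As New Uri("http://example.com/quotes")
    Private ReadOnly BannerUrl As New Uri("http://example.com/banner")

    ' Counts only the loads of the data page, like document_completed in the thread.
    Private completedCount As Integer = 0

    ' Hypothetical stand-in for the handler: filter by URL first, then count.
    Function OnDocumentCompleted(ByVal url As Uri) As String
        If url <> TargetUrl Then
            Return "ignored (not the data page)"
        End If
        completedCount += 1
        If completedCount = 1 Then
            Return "first load: set the options and click refresh"
        End If
        Return "load " & completedCount.ToString() & ": extract the table"
    End Function

    Sub Main()
        ' Assumed event sequence: every page load is followed by a banner load.
        Dim firedUrls() As Uri = {TargetUrl, BannerUrl, TargetUrl, BannerUrl}
        For Each url As Uri In firedUrls
            Console.WriteLine(url.AbsoluteUri & " -> " & OnDocumentCompleted(url))
        Next
    End Sub
End Module
```

Because the counter is incremented only after the URL check, banner loads no longer shift the count, which is why the simpler test document_completed = 1 is enough once the filter is in place.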

Thanks a lot! I think now I understand the DocumentCompleted structure better! I'll test this script, but I think there is still one problem:

If the last date in the table is yesterday, the script will click "more/next table" ("weiter") to get the next table. Now sometimes there is no further information [because the intraday data I need is saved for only 5 days or so]. Then, when clicking on "more/next table", the same table is loaded again, as there is no next table. In that case the program will endlessly repeat the reloading and extraction of that table. [With my data this is extremely unlikely to happen, but it happened yesterday for the first time in 2 weeks, so I got the same file a thousand times and the script (the former one) ran for about 12 hours until it crashed.] What I thought of to solve this problem was to save the last date for one turn, so that the next time we can compare whether the last date has changed. So we need the last date of the previous and the pre-previous table. It can probably be done more easily. So don't continue reading if you have an easy solution.

EDIT: I found an easier way, so don't read the second snippet.

EDIT 2: I tried it on http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX and it didn't work totally correctly: it produced the same file 2 times with this link (but still better than infinite times!).

    Dim previouslastdate As DateTime

    Private Sub Form1_Load ...
        previouslastdate = DateTime.Now.AddDays(-1000)
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Minimized
        Part1() ' Use WebBrowser control to load web page
    End Sub

    Private Sub WebBrowser1_DocumentCompleted...
        document_completed = document_completed + 1
        If (document_completed < 3) And (e.Url.AbsoluteUri = Seite) Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = Seite) Then ' Second to xth tables
            previouslastdate = lastdate
            Part5() ' Extract javascript and update last_datetime
            If lastdate > earliest_datetime And lastdate <> previouslastdate Then
                Part6() ' Click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub

But anyway, my idea was therefore to save the last date every second time into a new variable. My idea was to determine whether it is the second time by counting the DocumentCompleted events: as I understand it, we get this event 4 times within 2 turns. So here is the code... I just didn't know how to determine whether a variable is an integer...

Insert in the Part 5 sub:

    ...
    Dim checkdate1 As DateTime
    Dim checkdate2 As DateTime
    lastDate = Nothing
    Dim lastDateStr As String = "0"
    Dim separator As String = vbTab
    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")
    Using sw As StreamWriter = File.CreateText(Pfad & currentDataTime & ".txt")
        For Each row As String() In rows
            sw.WriteLine(String.Join(separator, row))
            lastDateStr = row(0)
        Next
    End Using
    If lastDateStr IsNot "0" Then
        lastdate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
        If document_completed / 4 gives an integer Then
            checkdate1 = lastdate
            checkdate2 = 0
        Else
            checkdate2 = lastdate
            checkdate1 = 0
        End If
        System.Threading.Thread.Sleep(2000)
    End If
    ...

and insert in the DocumentCompleted sub:

    ...
    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then ' Second to xth tables
        Part5() ' Extract javascript and update last_datetime
        If lastdate > earliest_datetime And document_completed / 4 gives an integer And checkdate2 <> lastdate Then
            Part6() ' Click Continue Button
        ElseIf lastdate > earliest_datetime And document_completed / 4 does not give an integer And checkdate1 <> lastdate Then
            Part6() ' Click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
    ...

Friday, February 1, 2008 2:08 PM

d.j.t

20 Points
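On the open question in the post above (how to tell whether document_completed / 4 "gives an integer"): in VB.NET this is usually written with the Mod operator, which returns the remainder of an integer division. The following is a minimal console sketch; the loop bounds are arbitrary and the variable name is borrowed from the thread only for readability.

```vbnet
Imports System

Module DivisibilityDemo
    Sub Main()
        For document_completed As Integer = 1 To 8
            ' A remainder of 0 means the count is an exact multiple of 4.
            If document_completed Mod 4 = 0 Then
                Console.WriteLine(document_completed.ToString() & ": multiple of 4")
            Else
                Console.WriteLine(document_completed.ToString() & ": not a multiple of 4")
            End If
        Next
    End Sub
End Module
```

In the pseudocode above, "document_completed / 4 gives an integer" would then read document_completed Mod 4 = 0, and the negated case document_completed Mod 4 <> 0.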

I did originally limit the document_completed count to 10 tables to avoid an infinite repeat in case there was a problem parsing the DateTime from the web page (bold red). You'll have the cybercops after you for a suspected DoS attack.

Here's the ultimate bug-free code (until you find the next one):

Code Snippet

    Dim previous_last_datetime As DateTime

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = seite) Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then
            previous_last_datetime = last_datetime
            Part5() ' Extract javascript and update last_datetime
            If previous_last_datetime > last_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub

Edited by Tim Mathias

Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.

Friday, February 1, 2008 7:04 PM

Tim Mathias

345 Points
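The two stop conditions in the post above (a cap on the number of tables, and stopping when a new table does not reach further back in time than the previous one) can be checked in isolation. This is a sketch only: CountAcceptedPages and MaxPages are hypothetical names, and the timestamps are made-up inputs that stand for the last row of each successive table.

```vbnet
Imports System
Imports System.Collections.Generic

Module PaginationStopDemo
    ' Hypothetical cap, mirroring the "document_completed < 11" guard in the thread.
    Private Const MaxPages As Integer = 10

    ' Returns how many pages are accepted before paging stops.
    Function CountAcceptedPages(ByVal pageLastTimes As IList(Of DateTime), ByVal start As DateTime) As Integer
        Dim lastSeen As DateTime = start
        Dim accepted As Integer = 0
        For Each pageLast As DateTime In pageLastTimes
            If accepted >= MaxPages Then Exit For
            ' A page counts as new data only if its last row is older than anything seen so far.
            If pageLast >= lastSeen Then Exit For
            lastSeen = pageLast
            accepted += 1
        Next
        Return accepted
    End Function

    Sub Main()
        ' Made-up example: the third page repeats the second one.
        Dim pages As New List(Of DateTime)
        pages.Add(New DateTime(2008, 1, 31, 9, 0, 0))
        pages.Add(New DateTime(2008, 1, 30, 17, 15, 8))
        pages.Add(New DateTime(2008, 1, 30, 17, 15, 8))
        pages.Add(New DateTime(2008, 1, 29, 9, 0, 0))
        Console.WriteLine(CountAcceptedPages(pages, New DateTime(2008, 2, 1, 12, 0, 0)))
    End Sub
End Module
```

For this input the function returns 2: the repeated third page is rejected, so paging stops there instead of looping on the same table.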

I've had a deeper look at the website's pagination problem. I've separated the reading of the table rows from the writing of the table rows -- Part5A and Part5B. I've also added a new variable -- more_data -- to test whether the next table is really more data or just a repeat of the last table. If you want you can also add a time limit to this test -- earliest_datetime -- as we had before.

Currently (at the time of writing this post) there's still a mysterious problem with that particular website with a double entry:

    30.01. 17:15:08    47,80    Handel    1.000
    30.01. 17:15:08    47,70    Handel    1.000

If you select 20 lines per page the latter of these entries disappears.

Here's the code:

Code Snippet

    Imports System.IO
    Imports System.Text.RegularExpressions

    Public Class Form1

        Dim seite As Uri
        Dim document_completed As Integer
        Dim last_datetime As DateTime
        Dim rows As ArrayList
        Dim more_data As Boolean

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            Trace.WriteLine(vbCrLf & vbCrLf & "Form1_Load")
            Me.WindowState = FormWindowState.Maximized
            Part1() ' Use WebBrowser control to load web page
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            Trace.WriteLine(vbCrLf & "WebBrowser1_DocumentCompleted url = " & e.Url.ToString)
            If (e.Url <> seite) Then
                Return ' Ignore banner page load
            End If
            document_completed = document_completed + 1
            Trace.WriteLine(vbCrLf & "document_completed = " & document_completed & vbCrLf)
            If document_completed = 1 Then ' First table
                Trace.WriteLine(vbCrLf & "Section A" & vbCrLf)
                Part2() ' Automatically select specified options from ComboBoxes
                Part3() ' Automatically check the CheckBox
                Part4() ' Automatically click the Button
            ElseIf more_data And document_completed < 11 Then
                Trace.WriteLine(vbCrLf & "Section B" & vbCrLf)
                Part5A() ' Read javascript table rows and update more_data
                If more_data Then
                    Part6() ' Automatically click More Button
                Else
                    Part5B() ' Write combined table rows to file
                    Close() ' Part 7: Close programme
                End If
            Else
                Trace.WriteLine("Too many tables.")
                Part5B() ' Write combined table rows to file
                Close() ' Part 7: Close programme
            End If
        End Sub

        Private Sub Part1()
            ' Part 1: Use WebBrowser control to load web page
            Trace.WriteLine("Part1: Use WebBrowser control to load web page")
            seite = New Uri("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
            document_completed = 0
            last_datetime = DateTime.Now
            rows = New ArrayList
            more_data = True
            WebBrowser1.Dock = DockStyle.Fill
            WebBrowser1.Navigate(seite)
        End Sub

        Private Sub Part2()
            ' Part 2: Automatically select specified options from ComboBoxes
            Trace.WriteLine("Part2: Automatically select specified options from ComboBoxes")
            Try
                ' Part 2A: Times & Sales
                Dim el1 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step")
                el1.SetAttribute("value", "0")

                ' Part 2B: 100 lines
                Dim el2 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Lines")
                el2.SetAttribute("value", "100")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part2: " & e.Message)
                Close()
            End Try
        End Sub

        Private Sub Part3()
            ' Part 3: Automatically check the CheckBox
            Trace.WriteLine("Part3: Automatically check the CheckBox")
            Try
                Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures")
                el.SetAttribute("checked", "true")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part3: " & e.Message)
                Close()
            End Try
        End Sub

        Private Sub Part4()
            ' Part 4: Automatically click the Button
            Trace.WriteLine("Part4: Automatically click the Button")
            Try
                Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1")
                el.InvokeMember("click")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part4: " & e.Message)
                Close()
            End Try
        End Sub

        Private Sub Part5A()
            ' Part 5A: Read javascript table rows and update more_data
            Trace.WriteLine("Part5A: Read javascript table rows and update more_data")
            Try
                Dim new_rows As New ArrayList
                Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
                Dim separator As String = vbTab
                For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
                    new_rows.Add(String.Join(separator, m.Value.Split(New String() {"\t"}, StringSplitOptions.None)))
                    Trace.WriteLine(new_rows(new_rows.Count - 1))
                Next
                ' The last row of the table carries the oldest timestamp on the page
                Dim str_new_last_datetime As String = new_rows(new_rows.Count - 1)
                Dim new_last_datetime As DateTime
                new_last_datetime = DateTime.ParseExact(str_new_last_datetime.Substring(0, 15), "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
                If (new_last_datetime < last_datetime) Then
                    Trace.WriteLine("Adding " & new_rows.Count & " new row(s) to combined rows.")
                    rows.AddRange(new_rows)
                    last_datetime = new_last_datetime
                Else
                    Trace.WriteLine("Skipping new row(s).")
                    more_data = False
                End If
            Catch e As Exception
                Trace.WriteLine("ERROR: Part5A: " & e.Message)
                Part5B() ' Save any accrued data
                Close()
            End Try
        End Sub

        Private Sub Part5B()
            ' Part 5B: Write combined table rows to file
            Trace.WriteLine("Part5B: Write combined table rows to file")
            If rows.Count > 0 Then
                Try
                    Dim current_datetime As String = DateTime.Now.ToString("yyyyMMddHHmmss") ' 24 hour clock
                    Trace.WriteLine("Writing " & rows.Count & " row(s) to file...")
                    Using sw As StreamWriter = File.CreateText("BrowserAutomation" & current_datetime & ".txt")
                        For Each row As String In rows
                            sw.WriteLine(row)
                        Next
                    End Using
                    Trace.WriteLine("Done.")
                Catch e As Exception
                    Trace.WriteLine("ERROR: Part5B: " & e.Message)
                    Close()
                End Try
            Else
                Trace.WriteLine("No data to write.")
            End If
        End Sub

        Private Sub Part6()
            ' Part 6: Automatically click More Button
            Trace.WriteLine("Part 6: Automatically click More Button")
            System.Threading.Thread.Sleep(2000)
            Try
                Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More")
                el.InvokeMember("click")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part6: " & e.Message)
                Part5B() ' Save any accrued data
                Close()
            End Try
        End Sub

    End Class

Edited by Tim Mathias

Wednesday, October 14, 2009 5:22 PM Reformatted code snippet.

Monday, February 4, 2008 10:48 AM

Tim Mathias

345 Points
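The row extraction in Part5A can be tried on a fixed string instead of WebBrowser1.DocumentText. The sketch below rests on an assumption about the page source, inferred only from the regular expression: each row is appended by a JavaScript line of the form myl+='...'; in which tab and line break appear as the literal escape sequences \t and \r\n. The two sample rows are the double entry quoted in the post above; the module and variable names are illustrative.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module RowExtractionDemo
    Sub Main()
        ' Assumed shape of the page source (see lead-in). VB string literals do not
        ' process escapes, so \t and \r\n below are literal backslash sequences.
        Dim source As String = _
            "myl+='30.01. 17:15:08\t47,80\tHandel\t1.000\r\n';" & vbCrLf & _
            "myl+='30.01. 17:15:08\t47,70\tHandel\t1.000\r\n';"

        ' Same pattern as in Part5A: text after myl+=' up to the literal \r\n'.
        Dim pattern As String = "(?<=" & Regex.Escape("myl+='") & ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

        Dim lastRow As String = Nothing
        For Each m As Match In Regex.Matches(source, pattern)
            ' Split on the two-character sequence backslash + t, then join with real tabs.
            Dim fields() As String = m.Value.Split(New String() {"\t"}, StringSplitOptions.None)
            lastRow = String.Join(vbTab, fields)
            Console.WriteLine(fields.Length.ToString() & " fields: " & lastRow)
        Next

        If lastRow IsNot Nothing Then
            ' The first 15 characters hold "dd.MM. HH:mm:ss". The data has no year,
            ' so ParseExact fills in the current year.
            Dim stamp As DateTime = DateTime.ParseExact(lastRow.Substring(0, 15), "dd.MM. HH:mm:ss", CultureInfo.CreateSpecificCulture("de-DE"))
            Console.WriteLine("Parsed year: " & stamp.Year.ToString())
        End If
    End Sub
End Module
```

Since the format has no year, a run that spans New Year would compare December rows against January rows in the same year; the thread's data covers only a few days, so this is an edge case rather than a problem seen here.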

Hi Tim, thanks for both your posts! I implemented the first post and it did work.

The missing lines on the website: we probably can't do anything about that, but that shouldn't matter, I hope.

Now your second post looks really scary. The commands you use are totally different! I'd like to understand all that, but at the moment I just have no time, as I am studying, exams are held next week, and then I'll be away for a while. But thanks anyway! Should what I have so far not work, I'll check it out! Thanks for all your help, I appreciate that a lot! Dominik

Monday, February 11, 2008 3:15 PM

d.j.t

20 Points

Hello d.j.t,

Considering that many developers in this forum ask how to automate a web page via WebBrowser, rotate or flip images, my team has created a code sample for this frequently asked programming task in the Microsoft All-In-One Code Framework. You can download the code samples at:

VBWebBrowserAutomation http://bit.ly/VBWebBrowserAutomation
CSWebBrowserAutomation http://bit.ly/CSWebBrowserAutomation

With these code samples, we hope to reduce developers' efforts in solving frequently asked programming tasks. If you have any feedback or suggestions for the code samples, please email us: onecode@microsoft.com.

-----------
The Microsoft All-In-One Code Framework (http://1code.codeplex.com) is a free, centralized code sample library driven by developers' needs. Our goal is to provide typical code samples for all Microsoft development technologies, and reduce developers' efforts in solving typical programming tasks. Our team listens to developers' pains in MSDN forums, social media and various developer communities. We write code samples based on developers' frequently asked programming tasks, and allow developers to download them with a short code sample publishing cycle. Additionally, our team offers a free code sample request service. This service is a proactive way for our developer community to obtain code samples for certain programming tasks directly from Microsoft.

Thanks
Microsoft All-In-One Code Framework

Thursday, March 24, 2011 10:22 AM

All-In-One Code Framework by Microsoft Microsoft All-In-One Cod...


65 Points
