
Finding Links

Demo. Link finder:

We want to write a program to find and extract all of the links in an HTML file. See FindLinks.

To do this, we need to know how a link is defined in an HTML file:

    <a href="the URL">link </a>

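So we need to find each opening tag, which starts with "<a " and ends at the next ">". The URL sits inside that opening tag as the value of href; the closing "</a>" only marks the end of the link text.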
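As a small illustration of that search, the sketch below pulls the first opening tag out of a single line of HTML using only indexOf and substring. The class name SingleLinkDemo, the method firstAnchorTag, and the sample line are made up for this example and are not part of FindLinks.

```java
class SingleLinkDemo {
    // Hypothetical helper: returns the first opening <a ...> tag in html,
    // or "" if there is none.
    static String firstAnchorTag(String html) {
        // Search a lowercase copy so that "<A " is found as well as "<a "
        int start = html.toLowerCase().indexOf("<a ");
        if (start < 0) {
            return "";              // indexOf returns -1 when nothing is found
        }
        int end = html.indexOf(">", start);
        if (end < 0) {
            return "";              // tag is never closed
        }
        // substring excludes its second index, so add 1 to keep the ">"
        return html.substring(start, end + 1);
    }

    public static void main(String[] args) {
        String line = "See <A HREF=\"http://example.com/\">this page</A> for details.";
        System.out.println(firstAnchorTag(line));
    }
}
```

The tag is cut from the original string rather than the lowercase copy, so it keeps the capitalization used in the page.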

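The algorithm, in outline:

Make a lowercase copy of the string that holds the HTML file, so that one search matches both "<a" and "<A".

Start with an empty result:

   String links = "";

Find the first position of "<a " in the lowercase copy.

While there is a link remaining (i.e., the position is not -1), find the next ">", copy the text from "<a " through ">" out of the original page, add it to links, and then search for the next "<a ".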

      // Extract all the links from a web page
  private String findLinks( String fullpage )
  {
      int tagPos,    // Start of the current <a tag
          tagEnd;    // Position of the first ">" after the tag start

      // A lowercase-only version of the page, used for searching
      String lowerpage = fullpage.toLowerCase();

      // Text of one <a ...> tag
      String tag;

      // The <a ...> tags found so far, one per line
      String links = "";

      // Append ">" to the page so that the search for ">" always succeeds
      fullpage = fullpage + " >";

      tagPos = lowerpage.indexOf("<a ", 0);

      while (tagPos >= 0)
      {
         // Search the original page, so the tag keeps its original case
         tagEnd = fullpage.indexOf(">", tagPos + 1);

         tag = fullpage.substring(tagPos, tagEnd + 1);

         links = links + tag + "\n";

         // Look for the next tag, starting at the end of this one
         tagPos = lowerpage.indexOf("<a ", tagEnd);
      }

      return links;
  }
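findLinks returns whole opening tags, one per line. If only the URLs are wanted, the same indexOf and substring steps can be applied once more to each tag. The following is a minimal sketch, not part of FindLinks: the class HrefDemo and the method hrefOf are hypothetical, and it assumes the href value is enclosed in double quotes with no spaces around the "=".

```java
class HrefDemo {
    // Hypothetical helper: returns the double-quoted href value in one tag,
    // or "" if the tag has no href="..." attribute.
    static String hrefOf(String tag) {
        String marker = "href=\"";
        // Search a lowercase copy so that HREF= is found as well
        int attr = tag.toLowerCase().indexOf(marker);
        if (attr < 0) {
            return "";
        }
        int start = attr + marker.length();     // first character of the URL
        int end = tag.indexOf("\"", start);     // closing quote
        if (end < 0) {
            return "";
        }
        return tag.substring(start, end);
    }

    public static void main(String[] args) {
        // Sample text in the format findLinks produces: one tag per line
        String links = "<a href=\"index.html\">\n<A HREF=\"http://example.com/\">\n";
        for (String tag : links.split("\n")) {
            System.out.println(hrefOf(tag));
        }
    }
}
```

Because it is a plain text search, this sketch would also match an attribute whose name merely ends in "href"; a real HTML parser avoids that kind of mistake.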
