Delphi-PRAXiS

Delphi-Lover 9 Oct 2006 16:59


Saving Document XML Source from TWebbrowser
 
Hello,

There are two properties for saving the web page loaded in the TWebBrowser component:
Document.Body.InnerHTML and Document.Body.InnerText (plus the corresponding OuterHTML and OuterText).

The problem with InnerHTML is that the browser has already rendered the page it received, so the result includes all kinds of extra HTML tags:

Original XML from the webserver:

<?xml version="1.0" encoding="UTF-8"?>
<Delphi exampe-version="1.0">&gt;
<tag>
<text>Delphi is great</text>
....

The InnerHTML will give:

<DIV class=e><SPAN class=b></SPAN> <SPAN class=m>&lt;?</SPAN><SPAN class=pi>xml version="1.0" encoding="UTF-8" </SPAN><SPAN class=m>?&gt;</SPAN> </DIV>
<DIV class=e>......

The problem with InnerText is that it includes "-" symbols, just as Internet Explorer shows them when it loads an XML file. (You can click these "-" symbols to expand and collapse the nodes of the XML.)

So InnerText gives:

<?xml version="1.0" encoding="UTF-8" ?>
- <Delphi exampe-version="1.0">
>
- <tag>

As you can see, this also corrupts the XML source.

Question: does anyone know how to save the real XML file received by the web browser?
If you use the "View Source" option in the browser, the XML file opens correctly, without any formatting added by the browser.
How can this option be simulated so that the source is saved to a file?

Greets,

Delphi-Lover.

shmia 9 Oct 2006 17:10

Re: Saving Document XML Source from TWebbrowser
 
Try this:
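Delphi code:
// Requires the units ActiveX (IPersistStreamInit), ComObj (OleCheck)
// and Classes (TStringStream, TStreamAdapter).
function Document_GetHTML(Document: IDispatch): string;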
var
   ms: TStringStream;
begin
   Result := '';
   if Assigned(Document) then
   begin
      ms := TStringStream.Create(Result);
      try
         OleCheck((Document as IPersistStreamInit).Save(TStreamAdapter.Create(ms),False));
         Result := ms.DataString;
      finally
         ms.Free;
      end;
   end;
end;

ShowMessage(Document_GetHTML(Webbrowser1.Document));

Delphi-Lover 10 Oct 2006 09:05

Re: Saving Document XML Source from TWebbrowser
 
Hello,

Thanks, but it doesn't work. The result is a handful of unknown characters; it looks like pointer data to me.
I'll keep trying. Any other suggestions?

Greetings,

Delphi-Lover.

Here is the code:

Delphi code:
procedure TfrmReceiveXML.GetXMLFromWeb;

   function Document_GetHTML(Document: IDispatch):string;
   var
     ms: TStringStream;
   begin
     Result := '';
     if Assigned(Document) then
     begin
        ms := TStringStream.Create(Result);
        try
           OleCheck((Document as IPersistStreamInit).Save(TStreamAdapter.Create(ms),False));
           Result := ms.DataString;
        finally
           ms.Free;
        end;
     end;
  end;

var
  IEApp, varXMLReturn: OLEVariant;
  HeaderStr, EncodedStr, strXMLReturn : string;
  Post: OleVariant;
  nIdx: Integer;
begin
  HeaderStr:='Content-Type: application/x-www-form-urlencoded' + #13#10; // HTTP header lines end with CR LF
  EncodedStr:='vl_xml='+HTTPEncode(MyMsg);
  Post:=VarArrayCreate([0,Length(EncodedStr)-1],varByte);
  for nIdx:=1 to Length(EncodedStr) do Post[nIdx-1]:=Ord(EncodedStr[nIdx]);

  IEApp:=CreateOLEObject('InternetExplorer.Application');
  IEApp.Navigate('https://website.com/getdata.asp','','',Post, HeaderStr);

  While (IEApp.ReadyState<>4) or (IEApp.Busy) do
  begin
    Application.ProcessMessages;
  end;

  strXMLReturn:=Document_GetHTML(IEApp.Document);
  Memo1.Text:=strXMLReturn;
  Memo1.Lines.SaveToFile('ReturnXML.xml');

  { This works, but has the rendering problem described above }
  //varXMLReturn:=IEApp.document.body.InnerText; {or InnerHTML}
  //strXMLReturn:=varXMLReturn;
  //Memo1.Text:=strXMLReturn;
  //Memo1.Lines.SaveToFile('ReturnXML.xml');

end;

Delphi-Lover 10 Oct 2006 13:17

Re: Saving Document XML Source from TWebbrowser
 
Hello,

I've searched and searched the web, but my question seems to be a difficult one. On a newsgroup I found someone with a similar problem:

View Source for WebBrowser control from C# program

Unfortunately I cannot find the final solution, which the poster added to the group as an attachment.
(The page above does not point to the original newsgroup but to one of those newsgroup-collector sites.)

But maybe it helps to understand the problem better.


Here is working code that does what shmia intended.
(It works, but it does not solve the problem...)

Delphi code:
var FileStream: TFileStream;

    procedure InternalSaveDocumentToStream(const Stream: TStream);
    var
      StreamAdapter: IStream;
      PersistStreamInit: IPersistStreamInit;
    begin
      Assert(Assigned(webbrowser.Document));
      if webbrowser.Document.QueryInterface(IPersistStreamInit, PersistStreamInit) = S_OK then
      begin
        StreamAdapter := TStreamAdapter.Create(Stream);
        PersistStreamInit.Save(StreamAdapter, False);
      end;
    end;

FileStream := TFileStream.Create('Return.xml', fmCreate);
try
  InternalSaveDocumentToStream(FileStream);
finally
  FileStream.Free;
end;

And here is code that displays the View Source window.
(It shows the real source of the page, but it launches a Notepad session and I am not able to save the text from the Delphi code...)

Delphi code:

    // Requires the units ActiveX (IOleCommandTarget) and SHDocVw (TWebBrowser).
    procedure WBViewSourceDialog(AWebBrowser: TWebBrowser);
    const
      CGID_WebBrowser: TGUID = '{ED016940-BD5B-11cf-BA4E-00C04FD70816}';
      HTMLID_VIEWSOURCE = 2;
    var
      CmdTarget: IOleCommandTarget;
      vaIn, vaOut: OleVariant;
    begin
      if AWebBrowser.Document = nil then
        Exit;
      try
        // CmdTarget is reference counted and released automatically by Delphi,
        // so no manual _Release call is made (it would release the object twice).
        AWebBrowser.Document.QueryInterface(IOleCommandTarget, CmdTarget);
        if CmdTarget <> nil then
          CmdTarget.Exec(@CGID_WebBrowser, HTMLID_VIEWSOURCE, 0, vaIn, vaOut);
      except
        // Any error while opening the source view is deliberately ignored.
      end;
    end;

Greets,

Delphi-Lover.
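
One possible way around the rendering problem, offered as an untested sketch and not as the solution found in the thread: skip the browser and send the same POST request through the MSXML2.XMLHTTP COM object, then write its responseBody bytes straight to a file. The helper name PostAndSaveRawResponse and the placeholder form body are assumptions made for illustration; the URL and the vl_xml field name are taken from the code posted above. Note that this issues a separate request of its own; it does not capture what the browser control already received.

```pascal
program SaveRawXml;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, Variants, ActiveX, ComObj;

// Hypothetical helper (not from the thread): POSTs FormBody to URL and
// writes the response bytes to FileName exactly as they were received.
procedure PostAndSaveRawResponse(const URL, FormBody, FileName: string);
var
  Http: OleVariant;
  Body: Variant;
  Status, Size: Integer;
  Data: Pointer;
  FS: TFileStream;
begin
  Http := CreateOleObject('MSXML2.XMLHTTP');
  Http.open('POST', URL, False); // False = synchronous request
  Http.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
  Http.send(FormBody);

  Status := Http.status;
  if Status <> 200 then
    raise Exception.CreateFmt('Unexpected HTTP status %d', [Status]);

  // responseBody is a variant array of bytes; no rendering is applied to it.
  Body := Http.responseBody;
  FS := TFileStream.Create(FileName, fmCreate);
  try
    if VarIsArray(Body) then
    begin
      Size := VarArrayHighBound(Body, 1) - VarArrayLowBound(Body, 1) + 1;
      if Size > 0 then
      begin
        Data := VarArrayLock(Body);
        try
          FS.WriteBuffer(Data^, Size);
        finally
          VarArrayUnlock(Body);
        end;
      end;
    end;
  finally
    FS.Free;
  end;
end;

begin
  CoInitialize(nil);
  try
    // 'vl_xml=%3Crequest%2F%3E' is a placeholder; the real code would pass
    // 'vl_xml=' + HTTPEncode(MyMsg) as in the procedure posted earlier.
    PostAndSaveRawResponse('https://website.com/getdata.asp',
      'vl_xml=%3Crequest%2F%3E', 'ReturnXML.xml');
  finally
    CoUninitialize;
  end;
end.
```

Writing responseBody instead of responseText keeps the bytes in the encoding the server sent, so the encoding="UTF-8" declaration in the saved file stays accurate.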

