Author
3 Apr 2007 5:17 PM
Yobbo
Hi All

I have an ASP function in place to strip invalid chars out of a data store
before I create an XML file of this data, but my function doesn't work on a
certain set of chars.

As far as I can see these are the following:

a) trademark char
b) long hyphen/dash char
c) smart/curly quotes (both left and right)

Even though my function is set up as follows:

Function ReFormatStringForXML(s)
IF LEN(s) > 0 AND NOT IsNull(s) THEN
  s = Replace(s,"&#8482;","&trade;")
  s = Replace(s,"&#8212;","-")
  s = Replace(s,"&#8217;","&quot;")
  s = Replace(s,"'","&quot;")
  s = Replace(s,"""","&quot;")
  s = Replace(s,"&","&amp;")
  s = Replace(s,"<","&lt;")
  s = Replace(s,">","&gt;")
END IF
ReFormatStringForXML = s
End Function

These chars still pass by and foul up my XML file.

I have a feeling that it's down to the fact that my function is looking for
the HTML equivalent rather than the actual char, but I can't possibly get away
with simply copying and pasting these friggin(!!) chars into my function.
Surely this is bad practice?

Does anybody know how I can trap and replace/remove these chars if need be?

Thanks

Author
4 Apr 2007 7:09 AM
Adrienne Boswell
Gazing into my crystal ball I observed "Yobbo" <info@NoSpamIt.com>
writing in news:ugCfLVhdHHA.2148@TK2MSFTNGP05.phx.gbl:

> Hi All
>
> I have an ASP function in place to strip invalid chars out of a data
> store before I create an XML file of this data, but my function
> doesn't work on a certain set of chars.
>
> As far as I can see these are the following:
>
> a) trademark char
> b) long hyphen/dash char
> c) smart/curly quotes (both left and right)

I detest these "smart" quotes.  Are regular quotes dumb by comparison?

>
> Even though my function is set up as follows:
>
> Function ReFormatStringForXML(s)
>  IF LEN(s) > 0 AND NOT IsNull(s) THEN
>   s = Replace(s,"&#8482;","&trade;")
>   s = Replace(s,"&#8212;","-")
>   s = Replace(s,"&#8217;","&quot;")
>   s = Replace(s,"'","&quot;")
>   s = Replace(s,"""","&quot;")
>   s = Replace(s,"&","&amp;")
>   s = Replace(s,"<","&lt;")
>   s = Replace(s,">","&gt;")
>  END IF
>  ReFormatStringForXML = s
> End Function
>
> These chars still pass by and foul up my XML file.
>
> I have a feeling that its down to the fact that my function is looking
> for the html equiv rather than the actual char, but I can't possibly
> get away with simply copy and pasting these friggin(!!) chars into my
> function. Surely this is bad practise?

You are putting in the HTML entity; you may need to put in the actual
character instead, for example:
s = Replace(s, Chr(60), "&lt;")
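
To match the actual characters without pasting them into the source file, one option is to build them from their Unicode code points with ChrW. The sketch below is only an illustration: the function name ReplaceTypographicChars is made up, and the choice of replacements (plain hyphen, &quot;, &apos;) is an assumption about what the output should look like. It escapes "&" first, because replacing it last would turn an already inserted &quot; into &amp;quot;.

```vbscript
' Hypothetical helper, not part of the original post.
Function ReplaceTypographicChars(s)
    ' Replace() raises an error on Null, so return an empty string instead.
    If IsNull(s) Then
        ReplaceTypographicChars = ""
        Exit Function
    End If

    ' Escape the ampersand first so later replacements are not double-escaped.
    s = Replace(s, "&", "&amp;")
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")
    s = Replace(s, """", "&quot;")
    s = Replace(s, "'", "&apos;")

    ' ChrW builds the character from its Unicode code point.
    s = Replace(s, ChrW(8482), "&#8482;")  ' trademark sign
    s = Replace(s, ChrW(8212), "-")        ' em dash
    s = Replace(s, ChrW(8211), "-")        ' en dash
    s = Replace(s, ChrW(8216), "&apos;")   ' left single curly quote
    s = Replace(s, ChrW(8217), "&apos;")   ' right single curly quote
    s = Replace(s, ChrW(8220), "&quot;")   ' left double curly quote
    s = Replace(s, ChrW(8221), "&quot;")   ' right double curly quote

    ReplaceTypographicChars = s
End Function
```

The trademark sign is written as a numeric reference here because &trade; is an HTML entity and is not predefined in XML.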

>
> Does anybody know how I can trap and replace/remove these chars if
> need be?
>
> Thanks
>
>
>
>

HTH

--
Adrienne Boswell at Home
Arbpen Web Site Design Services
http://www.cavalcade-of-coding.info
Please respond to the group so others can share

Author
4 Apr 2007 4:02 PM
Daniel Crichton
Yobbo wrote  on Tue, 3 Apr 2007 18:17:59 +0100:

> Hi All
>
> I have an ASP function in place to strip invalid chars out of a data store
> before I create an XML file of this data, but my function doesn't work on
> a certain set of chars.
>
> As far as I can see these are the following:
>
> a) trademark char
> b) long hyphen/dash char
> c) smart/curly quotes (both left and right)
>
> Even though my function is set up as follows:
>
> Function ReFormatStringForXML(s)
>  IF LEN(s) > 0 AND NOT IsNull(s) THEN
>   s = Replace(s,"&#8482;","&trade;")
>   s = Replace(s,"&#8212;","-")
>   s = Replace(s,"&#8217;","&quot;")
>   s = Replace(s,"'","&quot;")
>   s = Replace(s,"""","&quot;")
>   s = Replace(s,"&","&amp;")
>   s = Replace(s,"<","&lt;")
>   s = Replace(s,">","&gt;")
>  END IF
>  ReFormatStringForXML = s
> End Function
>
> These chars still pass by and foul up my XML file.
>
> I have a feeling that its down to the fact that my function is looking for
> the html equiv rather than the actual char, but I can't possibly get away
> with simply copy and pasting these friggin(!!) chars into my function.
> Surely this is bad practise?
>
> Does anybody know how I can trap and replace/remove these chars if need
> be?

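Your function is quite limited. What happens when a character not in your
list appears? The list of entities XML supports is pretty small: only &lt;,
&gt;, &amp;, &quot; and &apos; are predefined, so &trade; is not valid unless
you declare it yourself.

Here's the function I use in my own XML generation code. It's crude, but it works: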

Function XMLEncode(strText)

' Loop through the text and replace every character outside the safe set
' (space, hyphen, digits, letters) with an entity or a numeric reference.
Dim i, j, strNewText
strNewText = ""

For i = 1 To Len(strText)

  ' AscW returns the Unicode code point. Asc returns the ANSI code page
  ' value instead (e.g. 153 rather than 8482 for the trademark sign).
  j = AscW(Mid(strText,i,1))
  If j < 0 Then j = j + 65536 ' AscW is a signed 16-bit value in VBScript

  If j = 10 Then
    'replace line feed with an escaped <br>
    strNewText = strNewText & "&lt;br&gt;"
  ElseIf j = 13 Or j = 9 Then
    'strip carriage returns and tabs
  ElseIf j = 34 Then '"
    strNewText = strNewText & "&quot;"
  ElseIf j = 39 Then ''
    strNewText = strNewText & "&apos;"
  ElseIf j = 32 Or j = 45 Or (j >= 48 And j <= 57) Or (j >= 65 And j <= 90) Or _
         (j >= 97 And j <= 122) Then
    'ok: space, hyphen, 0-9, A-Z, a-z
    strNewText = strNewText & Mid(strText,i,1)
  ElseIf j = 38 Then '&
    strNewText = strNewText & "&amp;"
  ElseIf j = 60 Then '<
    strNewText = strNewText & "&lt;"
  ElseIf j = 62 Then '>
    strNewText = strNewText & "&gt;"
  Else
    'everything else becomes a numeric character reference
    strNewText = strNewText & "&#" & j & ";"
  End If

Next

XMLEncode = strNewText
End Function


This checks each character in the string in turn, replaces some with
entities, and replaces everything else other than letters, digits, spaces
and hyphens with a numeric character reference. You could easily add a few
more entity replacements as required. Just watch out for the first couple of
branches, where I replace line feeds with an escaped <br> and strip out
carriage returns and tabs, as that might not fit what you want to do with
the XML yourself.
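
As a rough usage sketch, assuming the XMLEncode function above is defined in the same ASP page, the encoded value can be dropped straight into an element. The element names and the sample value are made up for illustration.

```vbscript
' Assumes XMLEncode from this post is defined earlier in the same page.
Dim title, xml

title = "Fish & Chips <large>"   ' hypothetical sample value
xml = "<item><title>" & XMLEncode(title) & "</title></item>"

Response.ContentType = "text/xml"
Response.Write xml
```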

Dan
Author
5 Apr 2007 7:35 AM
Anthony Jones
"Yobbo" <info@NoSpamIt.com> wrote in message
news:ugCfLVhdHHA.2148@TK2MSFTNGP05.phx.gbl...
> Hi All
>
> I have an ASP function in place to strip invalid chars out of a data store
> before I create an XML file of this data, but my function doesn't work on a
> certain set of chars.
>
> As far as I can see these are the following:
>
> a) trademark char
> b) long hyphen/dash char
> c) smart/curly quotes (both left and right)
>
> Even though my function is set up as follows:
>
> Function ReFormatStringForXML(s)
>  IF LEN(s) > 0 AND NOT IsNull(s) THEN
>   s = Replace(s,"&#8482;","&trade;")
>   s = Replace(s,"&#8212;","-")
>   s = Replace(s,"&#8217;","&quot;")
>   s = Replace(s,"'","&quot;")
>   s = Replace(s,"""","&quot;")
>   s = Replace(s,"&","&amp;")
>   s = Replace(s,"<","&lt;")
>   s = Replace(s,">","&gt;")
>  END IF
>  ReFormatStringForXML = s
> End Function
>
> These chars still pass by and foul up my XML file.
>
> I have a feeling that its down to the fact that my function is looking for
> the html equiv rather than the actual char, but I can't possibly get away
> with simply copy and pasting these friggin(!!) chars into my function.
> Surely this is bad practise?
>
> Does anybody know how I can trap and replace/remove these chars if need
> be?
>
> Thanks

If you are creating an XML file can you use a DOMDocument to build it and
save it?
That'll ensure correct XML is created.
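
A minimal sketch of that approach, assuming MSXML is installed on the server and the ProgID MSXML2.DOMDocument.3.0 is available. The element names, the sample text and the products.xml path are hypothetical. The DOM escapes markup characters when it serializes, and the encoding named in the XML declaration is used when the file is saved, so the trademark sign, dash and curly quotes need no manual replacement.

```vbscript
Dim doc, root, item

' Assumption: MSXML 3.0 is registered on the server.
Set doc = Server.CreateObject("MSXML2.DOMDocument.3.0")

' The XML declaration sets the encoding used when the file is saved.
doc.appendChild doc.createProcessingInstruction("xml", "version=""1.0"" encoding=""UTF-8""")

Set root = doc.createElement("products")   ' hypothetical element names
doc.appendChild root

Set item = doc.createElement("product")
' Sample text with a trademark sign, an em dash, curly quotes and markup characters.
item.text = "Acme" & ChrW(8482) & " " & ChrW(8212) & " " & _
            ChrW(8220) & "smart" & ChrW(8221) & " <widgets> & more"
root.appendChild item

' Hypothetical output path; the web server account needs write permission.
doc.save Server.MapPath("products.xml")
```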
