
Regex Help


9 replies to this topic

#1 aommaster

aommaster

    I !<3 malware


  • Malware Response Team
  • 5,294 posts
  • OFFLINE
  •  
  • Gender:Male
  • Location:Dubai
  • Local time:10:01 PM

Posted 24 June 2009 - 08:09 AM

Hi guys!

I currently have the following code in VB .Net:
Dim matchcollection As System.Text.RegularExpressions.MatchCollection
Dim currentmatch As System.Text.RegularExpressions.Match
Dim lowercasetag As String
matchcollection = Regex.Matches(lines, "\[(?i)(.+?)\]")
	For Each currentmatch In matchcollection
		lowercasetag = currentmatch.ToString.ToLower
		lines = Regex.Replace(lines, "\[(?i)(.+?)\]", "[" & lowercasetag & "]")
	Next

The variable lines comes in as a string. The regex is supposed to detect BBCode tags regardless of case (which it does perfectly) and then convert them to lower case.

However, it seems that the lowercasetag variable cannot be understood by regex. I was wondering whether I was doing something wrong, or if there were an alternative way of doing this.

Thanks again!
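
For anyone following along, a minimal sketch of what a single pass of the loop above does. The sample string is made up for illustration. Regex.Replace substitutes every match of the pattern in one call, and currentmatch.ToString already includes the brackets, so the replacement string adds a second pair:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ReplaceAllDemo
    ' Runs one pass of the loop body from the post on a hypothetical input.
    Function OnePass(lines As String) As String
        Dim first As Match = Regex.Match(lines, "\[(?i)(.+?)\]")
        ' The match text includes its brackets, e.g. "[b]".
        Dim lowercasetag As String = first.ToString().ToLower()
        ' Replace touches EVERY match, not only the one held in "first".
        Return Regex.Replace(lines, "\[(?i)(.+?)\]", "[" & lowercasetag & "]")
    End Function

    Sub Main()
        Console.WriteLine(OnePass("[B]bold[/B]"))   ' prints [[b]]bold[[b]]
    End Sub
End Module
```

So the variable is understood fine; the trouble is that the same literal text is written over all tags on every iteration.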

Edited by aommaster, 24 June 2009 - 08:10 AM.

My website: http://aommaster.com
Please do not send me PM's requesting for help. The forums are there for a reason : )
If I am helping you and do not respond to your thread for 48 hours, please send me a PM




#2 groovicus

groovicus

  • Security Colleague
  • 9,963 posts
  • OFFLINE
  •  
  • Gender:Male
  • Location:Centerville, SD
  • Local time:12:01 PM

Posted 24 June 2009 - 11:40 AM

A regular expression is not a string; it is a pattern.

http://www.astahost.com/info.php/Visual-Ba...ed33_t5171.html

#3 aommaster


Posted 24 June 2009 - 11:55 AM

Hi!

Thanks for your reply. I understand that regex works by using patterns, but I thought perhaps I could put a pre-defined variable into it. Apparently not :thumbsup:

So my next question is, when my RegEx goes through the lines it reads, it correctly finds all the BBCode tags. How do I get it to change them to lowercase? I can't use the replace function because it is case sensitive. Is there something Regex can do to automatically change them to lower case?

Thanks again!



#4 groovicus


Posted 24 June 2009 - 12:11 PM

How about simply finding the bb tags, which are going to be the first and last characters in the string, and just make everything inside lower case? Again, I don't know the specific code for VB, but it would be something like:
newStr = str.substring(1, len(str)-1).toLower()

The potential downside is that I don't know what will happen if the character is not a letter. In that case, you would have to loop through the string and check each character, and convert to lower case. The API for ToUpper and ToLower does not say what happens if the character is not a letter, so write a simple program that converts PL=#$ to lower case and see what happens.

#5 aommaster


Posted 24 June 2009 - 12:27 PM

Hi!

Regarding this:

How about simply finding the bb tags, which are going to be the first and last characters in the string,

Not sure what you mean exactly by "first and last characters in a string". For that to be the case, I'd have to search for the opening square brackets first. Consider bold text tags that lie in the middle of a sentence: the string doesn't start with the BBCode tag.

I was looking for a way of pulling the value stored in $1, etc. from the regex, converting it to lowercase, and then throwing it back in.

Your code here
newStr = str.substring(1, len(str)-1).toLower()
VB .Net has similar syntax, so I can see what you're getting at :thumbsup: Here, I'll need to take some care, I think, in defining what the string "str" is. It can't be the whole line; it will need to be the BBCode line. But then the question comes up as to how to get those lines out, and again, same problem as above: how do I bring it down to lowercase?

Maybe I misunderstood you, please correct me if I did :flowers:

Thanks again for your time!
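
One way to do the "pull out $1, lowercase it, put it back" step in .NET is the Regex.Replace overload that takes a MatchEvaluator, which is called once per match and returns the replacement text for that match. A minimal sketch, assuming VB 2008 or later for the inline Function; the module name, function name, and sample text are made up for illustration:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LowercaseTags
    ' Lowercases the text inside every [...] tag and leaves the rest alone.
    Function LowercaseBbTags(source As String) As String
        ' Group 1 captures what sits between the brackets (the "$1" value).
        ' ToLowerInvariant is used so the result does not depend on the
        ' current culture; ToLower would also work for plain ASCII tags.
        Return Regex.Replace(source, "\[([^\]]+)\]", _
            Function(m As Match) "[" & m.Groups(1).Value.ToLowerInvariant() & "]")
    End Function

    Sub Main()
        Console.WriteLine(LowercaseBbTags("Some [B]Bold[/B] Text"))   ' prints Some [b]Bold[/b] Text
    End Sub
End Module
```

Because each match gets its own replacement, text outside the brackets is never touched and no index arithmetic is needed.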



#6 Billy O'Neal

Billy O'Neal

    Visual C++ STL Maintainer


  • Malware Response Team
  • 12,304 posts
  • OFFLINE
  •  
  • Gender:Male
  • Location:Redmond, Washington
  • Local time:11:01 AM

Posted 24 June 2009 - 12:29 PM

Hello :thumbsup:

A better regex would be
\[[^\]]+]
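It should execute faster: the negated character class [^\]] can never run past the closing bracket, so the engine does not have to stop and test for "]" after every character the way the lazy .+? does.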

You can use the regex to find the bbtags and tolower to do the tolowering... something like this (where theString is your input)

Dim theRegEx As New System.Text.RegularExpressions.Regex("\[[^\]]+]")
For Each theMatch As System.Text.RegularExpressions.Match In theRegEx.Matches(theString)
	' Lowercase everything between the brackets
	Dim lowerString As String = theString.Substring(theMatch.Index + 1, theMatch.Length - 2).ToLower()
	' Strings are immutable, so assign the results of Remove and Insert back
	theString = theString.Remove(theMatch.Index + 1, theMatch.Length - 2)
	theString = theString.Insert(theMatch.Index + 1, lowerString)
Next

Hope that helps,
Billy3

EDIT:
There are probably some syntax errors in there... it's been a while since I've written VB

Edited by Billy O'Neal, 24 June 2009 - 12:33 PM.

Twitter - My statements do not establish the official position of Microsoft Corporation, and are my own personal opinion. (But you already knew that, right?)

#7 Billy O'Neal


Posted 24 June 2009 - 12:39 PM

The potential downside is that I don't know what will happen if the character is not a letter. In that case, you would have to loop through the string and check each character, and convert to lower case. The API for ToUpper and ToLower does not say what happens if the character is not a letter, so write a simple program that converts PL=#$ to lower case and see what happens.

It leaves characters for which there is no lowercase representation unchanged.

Billy3
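
The test suggested above can be sketched in a few lines. The module name is made up, and ToLowerInvariant is used here instead of ToLower only so the result does not depend on the machine's culture settings:

```vbnet
Imports System

Module ToLowerDemo
    ' Returns the lowercased form of the sample string from the thread.
    Function LowerSample() As String
        ' Letters are lowercased; "=", "#" and "$" have no lowercase
        ' form and pass through unchanged.
        Return "PL=#$".ToLowerInvariant()
    End Function

    Sub Main()
        Console.WriteLine(LowerSample())   ' prints pl=#$
    End Sub
End Module
```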

#8 aommaster


Posted 24 June 2009 - 01:11 PM

Hi Billy!

This was the code I ended up using:
Dim theRegEx As New System.Text.RegularExpressions.Regex("\[[^\]]+]", System.Text.RegularExpressions.RegexOptions.IgnoreCase)
For Each theMatch As System.Text.RegularExpressions.Match In theRegEx.Matches(lines)
	   Dim lowerString As String = lines.Substring(theMatch.Index + 1, theMatch.Length - 2).ToLower()
	   lines = lines.Remove(theMatch.Index + 1, theMatch.Length - 2)
	   lines = lines.Insert(theMatch.Index + 1, lowerString)
Next

Needless to say, you got it pretty close :thumbsup:

Thanks again Billy!



#9 Billy O'Neal


Posted 24 June 2009 - 01:23 PM

You're welcome :thumbsup:

You can remove the "System.Text.RegularExpressions.RegexOptions.IgnoreCase" part... [ and ] characters are the same upper and lower case ;)

Billy3
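
A quick way to check this is to run the same pattern with and without the option and compare what it finds. This is a rough sketch; the module name, helper name, and sample text are made up for illustration:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module IgnoreCaseDemo
    ' Returns the matched tags joined by commas, for easy comparison.
    Function FindTags(source As String, options As RegexOptions) As String
        Dim found As New List(Of String)
        For Each m As Match In Regex.Matches(source, "\[[^\]]+]", options)
            found.Add(m.Value)
        Next
        Return String.Join(",", found.ToArray())
    End Function

    Sub Main()
        Dim sample As String = "[B]x[/b] [URL=http://example.com]y[/URL]"
        ' The pattern contains no letters, so IgnoreCase has nothing to act on.
        Console.WriteLine(FindTags(sample, RegexOptions.None) = FindTags(sample, RegexOptions.IgnoreCase))   ' prints True
    End Sub
End Module
```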

#10 aommaster


Posted 24 June 2009 - 01:27 PM

Ahh, right you are! I was trying to find out why VB .Net yelled at me so badly, so I threw that in just to make sure case sensitivity wasn't a problem :thumbsup:

Thanks again!
