BleepingComputer.com: Regex Help

Regex Help

#1 aommaster

  • I !<3 malware
  • Group: Malware Response Team
  • Posts: 5,171
  • Joined: 08-June 08
  • Gender:Male
  • Location:Dubai

Posted 24 June 2009 - 08:09 AM

Hi guys!

I currently have the following code in VB .Net:
Dim matchcollection As System.Text.RegularExpressions.MatchCollection
Dim currentmatch As System.Text.RegularExpressions.Match
Dim lowercasetag As String
matchcollection = Regex.Matches(lines, "\[(?i)(.+?)\]")
	For Each currentmatch In matchcollection
		lowercasetag = currentmatch.ToString.ToLower
		lines = Regex.Replace(lines, "\[(?i)(.+?)\]", "[" & lowercasetag & "]")
	Next


The variable lines comes in as a string. The regex is supposed to detect BBCode tags regardless of case (which it does perfectly), and the loop should then convert each tag to lower case.

However, it seems that the lowercasetag variable cannot be understood by regex. I was wondering whether I am doing something wrong, or whether there is an alternative way of doing this.

Thanks again!

This post has been edited by aommaster: 24 June 2009 - 08:10 AM

My website: http://www.aommaster.com
Please do not send me PM's requesting for help. The forums are there for a reason : )
If I am helping you and do not respond to your thread for 48 hours, please send me a PM

All my help is free. However, if you would like to make a donation, then please click here.



#2 groovicus

  • Hail Groovicus!
  • Group: Moderator
  • Posts: 9,522
  • Joined: 05-June 04
  • Gender:Male
  • Location:Centerville, SD

Posted 24 June 2009 - 11:40 AM

A regular expression is not a string; it is a pattern.

http://www.astahost.com/info.php/Visual-Ba...ed33_t5171.html
"Take the risk of thinking for yourself, much more happiness, truth, beauty, and wisdom will come to you that way" - Christopher Hitchens

#3 aommaster

Posted 24 June 2009 - 11:55 AM

Hi!

Thanks for your reply. I understand that regex works by using patterns, but I thought perhaps I could put a pre-defined variable into it. Apparently not :thumbsup:

So my next question is, when my RegEx goes through the lines it reads, it correctly finds all the BBCode tags. How do I get it to change them to lowercase? I can't use the replace function because it is case sensitive. Is there something Regex can do to automatically change them to lower case?

Thanks again!



#4 groovicus

Posted 24 June 2009 - 12:11 PM

How about simply finding the bb tags, which are going to be the first and last characters in the string, and just make everything inside lower case? Again, I don't know the specific code for VB, but it would be something like:
newStr = str.substring(1, len(str)-1).toLower()

The potential downside is that I don't know what will happen if the character is not a letter. In that case, you would have to loop through the string and check each character, and convert to lower case. The API for ToUpper and ToLower does not say what happens if the character is not a letter, so write a simple program that converts PL=#$ to lower case and see what happens.

#5 aommaster

Posted 24 June 2009 - 12:27 PM

Hi!

Regarding this:

Quote

How about simply finding the bb tags, which are going to be the first and last characters in the string,

I'm not sure exactly what you mean by "first and last characters in the string". For that to be the case, I'd have to search for the opening square brackets first. Consider bold text tags that lie in the middle of a sentence: the string doesn't start with the BBCode tag.

I was looking for a way of pulling the value stored in $1, etc. from the regex, converting it to lowercase, and then throwing it back in.
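One way to do this, sketched here as an assumption rather than as something posted in this thread, is the Regex.Replace overload that takes a MatchEvaluator. The delegate is called once per match and returns the replacement text, so each tag can be lowercased individually. The module name BbCodeDemo and the helper LowercaseTags are made up for illustration, and the pattern simply matches anything between square brackets.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BbCodeDemo
    ' Hypothetical helper: lowercases every [...] tag and leaves other text alone.
    Function LowercaseTags(ByVal input As String) As String
        ' The lambda is converted to a MatchEvaluator; it runs once per match
        ' and its return value replaces that match only.
        Return Regex.Replace(input, "\[[^\]]+\]", Function(m As Match) m.Value.ToLower())
    End Function

    Sub Main()
        Console.WriteLine(LowercaseTags("Some [B]Bold[/B] text"))   ' prints Some [b]Bold[/b] text
    End Sub
End Module
```

Note that this lowercases everything inside the brackets, so an attribute value such as the address in a URL tag would be lowercased too.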

Your code here
newStr = str.substring(1, len(str)-1).toLower()

VB .Net has similar syntax, so I can see what you're getting at :thumbsup: Here, I'll need to take some care, I think, in defining what the string "str" is. It can't be the whole line; it will need to be the BBCode line. But then the question comes up as to how to get those lines out, and again, the same problem as above: how do I bring it down to lowercase?

Maybe I misunderstood you, please correct me if I did :flowers:

Thanks again for your time!



#6 Billy O'Neal

  • Bleepin Engineer
  • Group: Malware Response Instructor
  • Posts: 10,079
  • Joined: 17-January 08
  • Gender:Male
  • Location:Cleveland, Ohio

Posted 24 June 2009 - 12:29 PM

Hello :thumbsup:

A better regex would be
\[[^\]]+]
it will execute faster.

You can use the regex to find the bbtags and tolower to do the tolowering... something like this (where theString is your input)

Dim theRegEx As New System.Text.RegularExpressions.Regex("\[[^\]]+]")
For Each theMatch As System.Text.RegularExpressions.Match In theRegEx.Matches(theString)
    ' Lowercase everything between the brackets of this tag
    Dim lowerString As String = theString.Substring(theMatch.Index + 1, theMatch.Length - 2).ToLower()
    ' Strings are immutable, so the results must be assigned back
    theString = theString.Remove(theMatch.Index + 1, theMatch.Length - 2)
    theString = theString.Insert(theMatch.Index + 1, lowerString)
Next

Hope that helps,
Billy3

EDIT:
There are probably some syntax errors in there... it's been a while since I've written VB

This post has been edited by Billy O'Neal: 24 June 2009 - 12:33 PM


#7 Billy O'Neal

Posted 24 June 2009 - 12:39 PM

groovicus, on Jun 24 2009, 01:11 PM, said:

The potential downside is that I don't know what will happen if the character is not a letter. In that case, you would have to loop through the string and check each character, and convert to lower case. The API for ToUpper and ToLower does not say what happens if the character is not a letter, so write a simple program that converts PL=#$ to lower case and see what happens.

It leaves characters for which there is no lowercase representation unchanged.
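A minimal check of that behaviour, using the sample string groovicus suggested (the module name ToLowerDemo is made up for illustration):

```vbnet
Imports System

Module ToLowerDemo
    Sub Main()
        ' "P" and "L" are lowercased; "=", "#" and "$" have no lowercase
        ' form and pass through unchanged.
        Console.WriteLine("PL=#$".ToLower())   ' prints pl=#$
    End Sub
End Module
```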

Billy3

#8 aommaster

Posted 24 June 2009 - 01:11 PM

Hi Billy!

This was the code I ended up using:
Dim theRegEx As New System.Text.RegularExpressions.Regex("\[[^\]]+]", System.Text.RegularExpressions.RegexOptions.IgnoreCase)
For Each theMatch As System.Text.RegularExpressions.Match In theRegEx.Matches(lines)
    Dim lowerString As String = lines.Substring(theMatch.Index + 1, theMatch.Length - 2).ToLower()
    lines = lines.Remove(theMatch.Index + 1, theMatch.Length - 2)
    lines = lines.Insert(theMatch.Index + 1, lowerString)
Next


Needless to say, you got it pretty close :thumbsup:

Thanks again Billy!



#9 Billy O'Neal

Posted 24 June 2009 - 01:23 PM

You're welcome :thumbsup:

You can remove the "System.Text.RegularExpressions.RegexOptions.IgnoreCase" part... [ and ] characters are the same upper and lower case ;)

Billy3

#10 aommaster

Posted 24 June 2009 - 01:27 PM

Ahh, right you are! I was trying to find out why VB .Net yelled at me so badly, so I threw that in just to make sure case sensitivity wasn't the problem :thumbsup:

Thanks again!


