
search multiple documents for email addresses


12 replies to this topic

#1 Blufx

Blufx

  • Members
  • 70 posts
  • OFFLINE
  • Gender:Male
  • Location:Upstate South Carolina
  • Local time:07:56 PM

Posted 18 June 2014 - 11:55 AM

I have a customer who has a folder with several hundred Word documents in .RTF format that contain his business contacts' information. I need to find out how he can extract the email addresses from the documents without having to open each one. Is this possible? Or can these files be converted to some kind of contacts file in order to do this? He needs to send out a flyer, and it would take countless hours to open all of these (around 800) and get the addresses.


 


Looking back... I should have been a lot more specific when I said I wanted to be somebody.




#2 joshmartin

joshmartin

  • Members
  • 27 posts
  • OFFLINE
  • Gender:Male
  • Local time:06:56 PM

Posted 18 June 2014 - 12:30 PM

See this link: http://support.microsoft.com/kb/2665750 

 

This will merge all of the Word documents into one. Then he can open that one document and copy/paste all of the addresses from there, or run his script on it, etc.



#3 Blufx


Posted 18 June 2014 - 02:29 PM

@joshmartin,

Thanks for the link. I was anxious to use it, but the link's no good. Can you give me another one, or the name of a program to search for?




#4 JohnC_21

JohnC_21

  • Members
  • 24,848 posts
  • ONLINE
  • Gender:Male
  • Local time:06:56 PM

Posted 18 June 2014 - 10:03 PM

I found this utility. Do not click the green download button; click the Direct Download link instead. I tested it on 4 RTF files with random email addresses and it pulled out all of them. It worked on an XP computer, but I have not tested it on Windows 7.

 

http://download.cnet.com/Email-Extractor/3000-2650_4-10527464.html



#5 Blufx


Posted 19 June 2014 - 07:53 AM

I tried it. It would run for 4 seconds, gather 50 addresses, then freeze. I did find another one called Easy Email Extractor that does the same thing. I'm hoping it works. It's been running for 4 hours now at 50% CPU load. I thought it would be done by now.




#6 JohnC_21


Posted 19 June 2014 - 08:28 AM

There is another method that uses a PowerShell script but I haven't tested it on multiple files.

#7 Blufx


Posted 19 June 2014 - 08:30 AM

I know even less about scripts and macros. I'm hoping the program works.




#8 JohnC_21


Posted 19 June 2014 - 08:50 AM

Hopefully it will work for you.

 

The script is not actually that bad. I found it here. I believe you would need to change the first line so the path ends in *.rtf. You would also need to create a PS folder in the root of C:\ and put all of the files in it.

 

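# Folder and wildcard: the files to search (here, every .rtf file in C:\PS)
$input_path = 'C:\PS\*.rtf'
$output_file = 'C:\PS\extracted_addresses.txt'
# Pattern for a typical email address; {2,4} limits the last domain part to 2-4 letters
$regex = '\b[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b'
# Collect every match and write one address per line to the output file
Select-String -Path $input_path -Pattern $regex -AllMatches | ForEach-Object { $_.Matches } | ForEach-Object { $_.Value } > $output_file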

 

Copy and paste this into Notepad and save it with a .ps1 extension (for example, Myscript.ps1 in C:\PS). Then run the script.

 

Type powershell in the search box and run it as admin.

At the prompt type:

 

Set-ExecutionPolicy RemoteSigned   <enter>

 

Then:

 

PS C:\> C:\PS\Myscript.ps1

 

I have not yet confirmed that this works on multiple files. The first line is the part I am not sure about: whether using *.rtf makes the script search all of the files. I am going to try it when I get some time.
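
For anyone who wants to try the "one pattern, many files" idea outside PowerShell, here is a minimal Python sketch of it: loop over every file in a folder that matches *.rtf, apply the same regular expression as the script in this post, and collect the matches. The function name extract_addresses and the sample file names are made up for illustration. It also assumes the addresses sit as plain text inside the RTF markup (RTF files are text files, which is why a raw pattern search can work at all), so treat it as a sketch rather than a tested replacement for the script.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as the PowerShell script in this post.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b")


def extract_addresses(folder, pattern="*.rtf"):
    """Return every address found in the files in `folder` that match `pattern`."""
    found = []
    for path in sorted(Path(folder).glob(pattern)):
        # RTF is text-based; bytes that are not ASCII are skipped instead of raising.
        text = path.read_text(encoding="ascii", errors="ignore")
        found.extend(EMAIL_RE.findall(text))
    return found


if __name__ == "__main__":
    # Small self-contained demo with made-up files in a temporary folder.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "a.rtf").write_text(r"{\rtf1\ansi Contact: alice@example.com}")
        Path(tmp, "b.rtf").write_text(r"{\rtf1\ansi bob@example.org and carol@example.net}")
        Path(tmp, "notes.txt").write_text("ignored@example.com")  # not *.rtf, so skipped
        print(extract_addresses(tmp))
```

In the demo, only the two .rtf files are searched, and the addresses from both end up in a single list. Pointing the function at a real folder (for example the C:\PS folder mentioned above) would be the equivalent of the script's first line.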

 

Edit: Are there a lot of email addresses in a single rtf file?


Edited by JohnC_21, 19 June 2014 - 08:53 AM.


#9 Blufx


Posted 19 June 2014 - 10:14 AM

Most of the files have only one, maybe two addresses.




#10 JohnC_21


Posted 19 June 2014 - 10:38 AM

Maybe it would be better to split up the 800 files into smaller batches, but that's kind of a kludge.



#11 Blufx


Posted 19 June 2014 - 10:48 AM

For now, I'm still hoping this piece of software does the trick. It's been running for hours doing something, but no progress bar or percentage done. Just have to wait it out.




#12 Blufx


Posted 22 June 2014 - 10:31 AM

It worked perfectly, with two exceptions: there is no percentage or progress bar, and it doesn't stop when it's finished. It just says "wait". I waited 6 hours, checking Task Manager; it was using a steady 48-52% of the CPU. I finally took a chance and stopped it. It was done. I don't know how long it had been done, but I'm betting it probably finished in the first hour. In 683 documents, it found 748 email addresses. The name of the app is Easy Email Extractor (eex.exe).
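
With more addresses than documents, some of them may well be repeats, so a clean-up pass before sending the flyer could look something like the Python sketch below. It assumes the extracted list is available as one address per line and that addresses differing only in letter case count as the same contact; dedupe_addresses is a hypothetical helper name, not something that comes with Easy Email Extractor.

```python
def dedupe_addresses(lines):
    """Return unique addresses in first-seen order, comparing case-insensitively."""
    seen = set()
    unique = []
    for line in lines:
        address = line.strip()   # drop surrounding spaces and line breaks
        key = address.lower()    # assumption: case differences are not meaningful
        if address and key not in seen:
            seen.add(key)
            unique.append(address)
    return unique


if __name__ == "__main__":
    # Made-up sample data standing in for the extractor's output.
    sample = ["Bob@Example.com", "bob@example.com ", "", "alice@example.org"]
    print(dedupe_addresses(sample))  # ['Bob@Example.com', 'alice@example.org']
```

The first spelling of each address is kept, and blank lines are dropped.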

 

Marked this one as solved




#13 JohnC_21


Posted 22 June 2014 - 10:43 AM

Thanks for the update. People who need an email extractor will be helped by your feedback.





