I am looking for a regular expression that will... Thread poster: Michael Beijer
| Michael Beijer United Kingdom Local time: 15:17 Member (2009) Dutch to English + ...
...find every line starting with a number of words enclosed in parentheses and followed by a number of words. In the example below, I would like to find only the last line. (word word) (word word word word) (word word) (word word word) (word) (word word word word) (word word word) word
Any suggestions are more than welcome! Michael PS: I'm not sure if this is relevant, but my text contains empty lines. Incidentally, is there any program that can come up with the regular expression for me automatically if I feed it a sample input and output?
[Edited at 2012-07-22 02:27 GMT] | | | I'm not sure I see the difficulty... | Jul 22, 2012 |
Do you need to select the whole line, or just find it? If it's the latter, a simple \)\s\w would work (it finds a closing parenthesis, a whitespace character such as a space or tab, and a word character, i.e. a letter, digit or underscore). But this seems too simple; perhaps you have a more complex format? If you need to select the whole line, \(.*\)\s\b.*\b works, but it's too tailored to your example... (it finds anything between parentheses, followed by a whitespace character, followed by "whole" words). Hope it helps!
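For anyone who wants to try the two expressions above outside a text editor, here is a minimal Python sketch. The sample text is an assumption: it guesses that each parenthesised group in the original example sat on its own line, with a few empty lines mixed in. The names SAMPLE, FIND_ONLY, WHOLE_LINE and matching_lines are made up for illustration.

```python
import re

# Assumed layout of the example from the first post: one parenthesised
# group per line, some empty lines, and only the last line has a trailing word.
SAMPLE = """\
(word word)
(word word word word)

(word word)
(word word word)
(word)

(word word word word)
(word word word) word
"""

# The two expressions suggested in the thread, unchanged.
FIND_ONLY = re.compile(r"\)\s\w")
WHOLE_LINE = re.compile(r"\(.*\)\s\b.*\b")


def matching_lines(text, pattern):
    """Return every line of text in which pattern matches somewhere."""
    return [line for line in text.splitlines() if pattern.search(line)]


found = matching_lines(SAMPLE, FIND_ONLY)
selected = matching_lines(SAMPLE, WHOLE_LINE)

print(found)
print(selected)
```

Testing line by line keeps \s from matching a line break, so the empty lines in the text cannot cause a match that spans two lines.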
[Edited at 2012-07-22 03:20 GMT] | | | Michael Beijer United Kingdom Local time: 15:17 Member (2009) Dutch to English + ... TOPIC STARTER Hello Rossana, | Jul 22, 2012 |
Your second expression \(.*\)\s\b.*\b
does the trick! It's a little hard to explain, but the reason I need to do this is that I am working on a very large glossary in CSV format (with 5 columns: Dutch, Dutch Definition, English, second English, third English term), and I need to isolate only the lines that have something like '(word word word) word' on them, because these need to be fixed. In a previous step, I used parentheses in a Find and Replace operation (it's a long story...), and in doing so inadvertently messed up something that now needs to be fixed. I hope this makes at least SOME sense. Anyway, I had been scrolling down through the whole thing manually, trying to spot all of these lines visually, but with your regex I can now just find them all in one fell swoop and fix them, which will save me a LOT of time. Thanks! Michael | | | Glad it helped! | Jul 22, 2012 |
Remember that regex searches can preserve the match, so if you need to "move" that parenthesis within the line, it can also be automated. Best of luck with that glossary!
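As a rough illustration of "preserving the match", capture groups let a replacement reuse the pieces that were found. The thread never says what the actual fix should be, so the rule below (pulling the trailing words inside the closing parenthesis) is purely hypothetical, as are the names TRAILING and move_into_parentheses. It also assumes the line starts with a single parenthesised group.

```python
import re

# Group 1: the text inside the parentheses.
# Group 2: the words that follow the closing parenthesis.
# Hypothetical pattern, written for a line of the form "(word word word) word".
TRAILING = re.compile(r"^\(([^()]*)\)[ \t]+(\w.*)$")


def move_into_parentheses(line: str) -> str:
    """Rewrite '(a b c) d' as '(a b c d)'; leave every other line unchanged."""
    return TRAILING.sub(r"(\1 \2)", line)


print(move_into_parentheses("(word word word) word"))
print(move_into_parentheses("(word word)"))
```

In an editor's Find and Replace dialog the same idea is usually written with \1 and \2 (or $1 and $2, depending on the program) in the replacement field.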