Thursday, December 19, 2013

Double character to single with regex

How do you unescape a string that was escaped by doubling some character?

For example, the CSV format doubles every double quote and wraps the entire value in double quotes, so the string

My "fat" string!

becomes

"My ""fat"" string!"

To unescape that, I have to replace each doubled double quote with a single double quote, and remove each lone double quote.

"  -> (nothing)
"" -> "

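This is the regex I need:
/"("{0,1})/
It matches a double quote optionally followed by a second one. The capturing group holds the second quote if it is present and is empty otherwise, so I replace each match with the content of the capturing group.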

Example code in PHP:


// ANSI (single-byte) strings
$s = preg_replace('/"("{0,1})/', '$1', '"My ""fat"" string!"');

// Unicode (UTF-8) strings; mb_ereg_replace takes the pattern without delimiters
mb_regex_encoding('UTF-8');
$s = mb_ereg_replace('"("{0,1})', '\\1', '"My ""fat"" string!"');
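
One edge case worth knowing: for an empty quoted field, "", the single regex reads the pair as an escaped quote and returns " instead of an empty string. Below is a minimal sketch of an alternative that strips the wrapping quotes first and only then collapses the doubled ones. It uses substr and str_replace rather than the regex, and the helper names csv_escape and csv_unescape are hypothetical, chosen just for this illustration. It assumes the input is a single field, not a whole CSV line.

```php
<?php
// Hypothetical helpers, named here for illustration only.

// Escape a value the CSV way: double each quote, then wrap in quotes.
function csv_escape(string $value): string
{
    return '"' . str_replace('"', '""', $value) . '"';
}

// Unescape one field: strip the wrapping quotes first, then collapse pairs.
function csv_unescape(string $field): string
{
    // Only strip the wrapper if the field really starts and ends with a quote.
    if (strlen($field) >= 2 && $field[0] === '"' && substr($field, -1) === '"') {
        $field = substr($field, 1, -1);
    }
    // Every remaining doubled quote stands for one literal quote.
    return str_replace('""', '"', $field);
}

var_dump(csv_unescape('"My ""fat"" string!"') === 'My "fat" string!'); // bool(true)
var_dump(csv_unescape('""') === '');                                   // bool(true)
var_dump(preg_replace('/"("{0,1})/', '$1', '""') === '"');             // bool(true): the regex edge case
```

For any value that is not empty the two approaches should agree, so the regex remains the shorter option when empty fields cannot occur.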