PHP Word Wrap
PHP Word Wrap is a simple PHP routine to reformat text to a predefined column width. Sometimes this is called word wrap. I use the routine in my web email script, so my emails arrive correctly formatted. Basically it turns this:
I think your PHP Word Wrap script function is really really cool!
into this:
I think your PHP Word Wrap script function is really really cool!
You can use this code easily without needing to understand it. To reformat some text, you would call it as follows:
$str = wordwrap($str, $width);
where $str is the string you want to reformat, and $width is an integer that represents the column width you want to format the text to. That's it. Easy, isn't it?
Undoubtedly some of you will want to get your feet wet and try to understand the code. If so, read on and I'll try to explain what is happening in all that PHP code.
To reformat the text correctly, this is what we need to do:
- Get rid of any unnecessary spaces and newlines.
- Split the string into a series of paragraphs that we will reformat individually.
- Tokenize each paragraph (that is, split it up into a series of words) and construct lines of tokens that are shorter than our chosen column width limit.
- Join all the reformatted paragraphs together into one long string again.
The way I've chosen to implement this involves two functions. wordwrapLine formats an individual paragraph, while wordwrap refines the string into paragraphs that are ready to be modified by wordwrapLine.
wordwrapLine is a fairly easy function to write. All you really need to do is this:
- Tokenize the string (i.e. extract a word at a time).
- Keep adding tokens / words to a line until the next one would push it over the word wrap limit.
- When a line is finished, begin another.
- Keep doing this until we run out of words / tokens.
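The steps above can be sketched roughly as follows. This is not the original wordwrapLine: the name wrapParagraph, the use of strtok() and the "\r\n" line ending are assumptions made for illustration, and the fit test is a plain length comparison rather than the author's "+ 2" version.

```php
<?php
// Hypothetical helper (not the author's wordwrapLine): greedy wrap of one paragraph.
function wrapParagraph(string $text, int $width): string
{
    $lines = [];
    $line  = '';

    // strtok() hands back one whitespace-delimited word per call.
    $tok = strtok($text, " \t\r\n");
    while ($tok !== false) {
        if ($line === '') {
            // The first word on a line is always accepted, even if it is too long.
            $line = $tok;
        } elseif (strlen($line) + 1 + strlen($tok) <= $width) {
            // The word plus its separating space still fits on this line.
            $line .= ' ' . $tok;
        } else {
            // The line is full: store it and start the next one with this word.
            $lines[] = $line;
            $line    = $tok;
        }
        $tok = strtok(" \t\r\n");
    }

    // Flush the last, partially filled line.
    if ($line !== '') {
        $lines[] = $line;
    }

    return implode("\r\n", $lines);
}

echo wrapParagraph('I think your PHP Word Wrap script function is really really cool!', 30), "\n";
```

With a width of 30, the sample sentence comes out as three lines, none longer than 30 characters. A word longer than the width is left on a line of its own rather than being cut.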
The one catch you will find in wordwrapLine is this line:
if (strlen($line) + strlen($tok) < ($l + 2) )
Where does the "+ 2" come from? Well, at first I was confused by this, but Scott @ Shadow Technologies has the explanation! "When you strip the spaces out of the tokens you are losing 2 characters... the space at the beginning and the end of the token string. I believe this is why you need the +2."
The wordwrap function, for its part, has to:
- Refine the string in such a way that we can send parts of it as input to wordwrapLine.
- Split the string into an array of paragraphs.
- Step through the array and perform wordwrapLine on each paragraph.
- Join the paragraphs together, trim the string and we're done.
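As a rough outline of the split / wrap / join part of that list, here is a minimal sketch. It is not the original wordwrap: the name wrapText is hypothetical, paragraphs are assumed to be separated by exactly one blank line ("\r\n\r\n"), and PHP's built-in wordwrap() stands in for wordwrapLine.

```php
<?php
// Hypothetical outline (not the author's wordwrap): wrap each paragraph separately.
function wrapText(string $str, int $width): string
{
    // Assumption: paragraphs are separated by one blank line ("\r\n\r\n").
    $paragraphs = explode("\r\n\r\n", trim($str));

    foreach ($paragraphs as $i => $paragraph) {
        // Stand-in for wordwrapLine: the built-in wordwrap() does the per-paragraph work.
        $paragraphs[$i] = wordwrap($paragraph, $width, "\r\n");
    }

    // Join the paragraphs back together and trim the result.
    return trim(implode("\r\n\r\n", $paragraphs));
}

echo wrapText("aaa bbb ccc\r\n\r\nddd eee", 7), "\n";
```

Wrapping paragraph by paragraph keeps the blank lines between paragraphs intact, because the line breaker never sees them.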
The wordwrap function begins by performing a whole stack of icky regular expressions on $str:
$str = ereg_replace("([^\r\n])\r\n([^\r\n])", "\\1 \\2", $str);
$str = ereg_replace("[\r\n]*\r\n[\r\n]*", "\r\n\r\n", $str);
$str = ereg_replace("[ ]* [ ]*", ' ', $str);
$str = StripSlashes($str);
These expressions work together to clean up the text so that it can be processed correctly by the wordwrapLine function. The first expression changes any single newline (that is, one that isn't preceded or followed by another newline) into a space. The second expression finds any remaining run of newlines and replaces it with exactly two newlines, that is, a single blank line between paragraphs. The third expression replaces any sequence of more than one space with a single space. StripSlashes removes any unnecessary slashes that PHP may have introduced to the string.
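The ereg functions were removed in PHP 7, so on a current PHP the same three clean-up passes would have to be written with preg_replace(). The sketch below is one possible translation, not the original code: the function name normalizeWhitespace is hypothetical, the first pattern uses lookarounds instead of capture groups, and the StripSlashes call is left out on the assumption that magic quotes (removed in PHP 5.4) are not in play.

```php
<?php
// Hypothetical helper: the article's three clean-up passes, rewritten with preg_replace().
function normalizeWhitespace(string $str): string
{
    // 1. A lone CRLF sitting between two non-newline characters becomes a space.
    $str = preg_replace('/(?<=[^\r\n])\r\n(?=[^\r\n])/', ' ', $str);
    // 2. Any remaining run of newlines collapses to one blank line.
    $str = preg_replace('/[\r\n]*\r\n[\r\n]*/', "\r\n\r\n", $str);
    // 3. Runs of spaces collapse to a single space.
    $str = preg_replace('/ +/', ' ', $str);

    return $str;
}

echo normalizeWhitespace("one\r\ntwo   three\r\n\r\n\r\nfour"), "\n";
```

The lookarounds do not consume the neighbouring characters, so two lone newlines separated by a single character (as in "a\r\nb\r\nc") are both replaced.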
Update: This page was written some years ago, and word wrap has been built in to PHP since version 4 (and we're already up to version 5 now). In most cases, it's better to use the built-in version. But this page explains the old code I wrote and used in the days of PHP 3.
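For comparison, a call to the built-in function looks like this. The width of 30 and the "\n" line break are arbitrary choices for the example. Unlike the routine described on this page, the built-in version does not collapse extra spaces or rejoin paragraphs first.

```php
<?php
$text = 'I think your PHP Word Wrap script function is really really cool!';

// Built-in: wordwrap(string $string, int $width = 75, string $break = "\n", bool $cut_long_words = false)
$wrapped = wordwrap($text, 30, "\n");

echo $wrapped, "\n";
```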
That's it. Hopefully this helped you learn something new about PHP. If it did help you and you're feeling generous, why not buy one of my shareware Photoshop plugins (because the funds help keep this site online and it will bring me joy ;)
Happy PHP scripting!