Extract words from a text

This snippet extracts the words from a text, cleans it up a little, and stores the words in an array. The clean-up step replaces every character that is not a to z, A to Z, 0 to 9, or - (hyphen) with a space.

<?php
$text = "This is the fake-text, from start!";

$RemoveChars  = array("([^a-zA-Z0-9-])"); //English characters
//$RemoveChars  = array("([^a-zåäöA-ZÅÄÖ0-9-])u"); //Swedish characters (UTF-8 text, note the u modifier)
$ReplaceWith = array(" ");
$text = preg_replace($RemoveChars, $ReplaceWith, $text);

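// explode() leaves empty strings where two separators were adjacent, so drop them.
// array_filter() keeps the original keys, which is why the indexes can have gaps.
$words = array_filter(explode(" ", $text), 'strlen');
print_r($words);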
?>


Result:
Array
(
    [0] => This
    [1] => is
    [2] => the
    [3] => fake-text
    [5] => from
    [6] => start
)
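
As an alternative, the clean-up and the split can be done in one step. The following is only a minimal sketch, not part of the original snippet: the function name extract_words is made up for illustration, and it relies on preg_split() with the PREG_SPLIT_NO_EMPTY flag so that the returned array has no empty entries and is indexed consecutively from 0.

```php
<?php
// Hypothetical helper name; not part of the original snippet.
function extract_words(string $text): array
{
    // Split on every run of characters other than a-z, A-Z, 0-9 and hyphen.
    // PREG_SPLIT_NO_EMPTY drops the empty strings left by adjacent separators.
    return preg_split('/[^a-zA-Z0-9-]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(extract_words("This is the fake-text, from start!"));
```

With this variant "from" ends up at index 4 and "start" at index 5, because no gaps are left behind.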

If your text contains Swedish characters in both uppercase and lowercase, you might need to take a look at "Uppercase Swedish characters ÅÄÖ".
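
For text with Swedish (or other non-English) letters, one possible approach is to let the regular expression work on Unicode letters instead of listing them by hand. This is a sketch under assumptions: the function name extract_words_utf8 is hypothetical, and the input is assumed to be UTF-8 encoded. It uses the u modifier together with \p{L} (any letter) and \p{N} (any number).

```php
<?php
// Hypothetical helper name; assumes $text is valid UTF-8.
function extract_words_utf8(string $text): array
{
    // \p{L} matches any Unicode letter, \p{N} any Unicode number.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $words = preg_split('/[^\p{L}\p{N}-]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false if the subject is not valid UTF-8.
    return $words === false ? [] : $words;
}

print_r(extract_words_utf8("Räksmörgås är gott, ÅÄÖ!"));
```

Note that this keeps letters from every alphabet, not only Swedish ones, which may or may not be what you want.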
