PHP MANUAL 5.x, 4.x, 3.x - partially translated into Polish / source: www.php.net
soundex
(PHP 3, PHP 4, PHP 5)
soundex -- Calculate the soundex key of a string
Description
string soundex ( string str )
Calculates the soundex key of str.
Soundex keys have the property that words pronounced similarly
produce the same soundex key, and can thus be used to simplify
searches in databases where you know the pronunciation but not
the spelling. This soundex function returns a string 4 characters
long, starting with a letter.
This particular soundex function is one described by Donald Knuth
in "The Art Of Computer Programming, vol. 3: Sorting And
Searching", Addison-Wesley (1973), pp. 391-392.
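The database-search use described above can be sketched roughly as follows. This is only an illustration: the `$names` array stands in for a database column, and `soundsLike()` is a hypothetical helper, not part of PHP.

```php
<?php
// Hypothetical in-memory list of surnames, standing in for a database column.
$names = ['Euler', 'Ellery', 'Gauss', 'Ghosh', 'Hilbert', 'Heilbronn', 'Knuth', 'Kant'];

// Build an index from soundex key to the names that share that key.
$index = [];
foreach ($names as $name) {
    $index[soundex($name)][] = $name;
}

// Hypothetical helper: return every indexed name whose key matches the query's key.
function soundsLike(array $index, string $query): array
{
    return $index[soundex($query)] ?? [];
}

// A misspelled query still finds the similarly pronounced names.
print_r(soundsLike($index, 'Gaus'));
```

Storing the key next to each name (or in an indexed column) keeps the lookup to a plain equality comparison.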
Example 1. Soundex Examples
<?php
soundex("Euler")       == soundex("Ellery");    // E460
soundex("Gauss")       == soundex("Ghosh");     // G200
soundex("Hilbert")     == soundex("Heilbronn"); // H416
soundex("Knuth")       == soundex("Kant");      // K530
soundex("Lloyd")       == soundex("Ladd");      // L300
soundex("Lukasiewicz") == soundex("Lissajous"); // L222
?>
See also
levenshtein(),
metaphone(), and
similar_text().
User Contributed Notes
04-Oct-2005 09:25
Since the first letter of the word is included verbatim in the soundex key, words such as "klansy" and "clansy" produce different keys even though they sound the same. If you want to avoid this, compare only the substring after the first character: the first letter is simply copied from the word, while the digits that follow represent its phonetic structure.
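A minimal sketch of that idea, assuming a hypothetical helper named `soundexTailEquals()` (it is not a PHP built-in):

```php
<?php
// Hypothetical helper: compare only the digits of the two soundex keys,
// ignoring the literal first letter of each word.
function soundexTailEquals(string $a, string $b): bool
{
    return substr(soundex($a), 1) === substr(soundex($b), 1);
}

var_dump(soundex('klansy') === soundex('clansy')); // bool(false): K452 vs C452
var_dump(soundexTailEquals('klansy', 'clansy'));   // bool(true): 452 vs 452
```

The trade-off is more false positives: any two words with the same consonant pattern after the first letter will now match, whatever their first letters are.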
crchafer-php at c2se dot com
13-Sep-2005 05:25
Rewritten, maybe -- but the algorithm has some obvious
optimisations which can be done, for example...
function text__soundex( $text ) {
    $k = ' 123 12  22455 12623 1 2 2';
    $nl = strlen( $tN = strtoupper( $text ) );
    $p = trim( $k[ ord( $tS = $tN[0] ) - 65 ] );
    for( $n = 1; $n < $nl; ++$n )
        if( ( $l = trim( $k[ ord( $tN[ $n ] ) - 65 ] ) ) != $p )
            $tS .= ( $p = $l );
    return substr( $tS . '000', 0, 4 );
}
// Notes:
// $k is the $key, essentially $SoundKey inverted: 26 characters,
//   one per letter A-Z, with a space for letters that have no code
// $tN is the uppercase of the text to be optimised
// $tS is the partially generated output
// $l is the code of the current letter, $p that of the previous one
// $n and $nl are iteration indices
// 65 is ord('A'), precalculated for speed
// non-ASCII letters are not supported
// watch the brackets, quite a mixture here
(Code has suffered only basic tests, though it appears to
match the output of PHP's soundex(), speed untested --
though this should be /much/ faster than a4_perfect's
rewrite due to the removal of most loops and compares.)
C
2005-09-13
a4_perfect at mail dot ru
01-Aug-2005 03:18
Even after being rewritten, the function by [administrator at zinious dot com] is approximately 30 times slower than soundex():
<?php
function MakeSoundEx($stringtomakesoundexof)
{
    $temp_Name = strtoupper($stringtomakesoundexof);
    // Index = soundex digit; group 7 holds the letters that get no digit.
    $SoundKey = array(1=>"BPFV", "CSKGJQXZ", "DT", "L", "MN", "R", "AEHIOUWY");
    $temp_Last = "";
    $temp_Soundex = substr($temp_Name, 0, 1);
    // Remember the code of the first letter so a repeat of it is skipped.
    for ($x = 1; $x <= sizeof($SoundKey); $x++)
        for ($i = 0; $i < strlen($SoundKey[$x]); $i++)
            if ($temp_Soundex == substr($SoundKey[$x], $i, 1))
                $temp_Last = (string)($x==7?"":$x);
    // Encode the remaining letters (index 1 onwards) until the key is full.
    for ($n = 1; $n < strlen($temp_Name); $n++)
        if (strlen($temp_Soundex) < 4)
        {
            for ($x = 1; $x <= sizeof($SoundKey); $x++)
                for ($i = 0; $i < strlen($SoundKey[$x]); $i++)
                    if (substr($temp_Name, $n, 1)==substr($SoundKey[$x], $i, 1))
                    {
                        if($x<7 && $temp_Last!=(string)$x)
                            $temp_Soundex = $temp_Soundex.$x;
                        $temp_Last = (string)($x);
                    }
        }
    // Pad with zeros to 4 characters.
    return $temp_Soundex . str_repeat("0", 4-strlen($temp_Soundex));
}
?>
Marc Quinton.
06-Jan-2005 03:53
justin at NO dot blukrew dot SPAM dot com
21-Sep-2004 01:18
I originally looked at soundex() because I wanted to compare how individual letters sound, so that the characters of a generated string would be easy to distinguish from each other when read aloud (e.g. TGDE is hard to distinguish, whereas RFQA is easier to understand). The goal was to generate IDs that could be understood with a high degree of accuracy over a radio of varying quality. I quickly figured out that soundex and metaphone wouldn't do this (they work for words), so I wrote the following to help out. The ID generation function iteratively calls chrSoundAlike() to compare each new character with the preceding characters. I'd be interested in receiving any feedback on this. Thanks.
<?php
function chrSoundAlike($char1, $char2, $opts = FALSE) {
    $char1 = strtoupper($char1);
    $char2 = strtoupper($char2);
    $opts = strtoupper($opts);
    switch ($opts) {
        case 'NUMBERS':
            $sets = array(0 => array('A', 'J', 'K'),
                          1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z', '3'),
                          2 => array('F', 'S', 'X'),
                          3 => array('I', 'Y'),
                          4 => array('M', 'N'),
                          5 => array('Q', 'U', 'W'));
            break;
        case 'STRICT':
            $sets = array(0 => array('A', 'J', 'K'),
                          1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z'),
                          2 => array('F', 'S', 'X'),
                          3 => array('I', 'Y'),
                          4 => array('M', 'N'),
                          5 => array('Q', 'U', 'W'));
            break;
        case 'BOTH':
            $sets = array(0 => array('A', 'J', 'K'),
                          1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z', '3'),
                          2 => array('F', 'S', 'X'),
                          3 => array('I', 'Y'),
                          4 => array('M', 'N'),
                          5 => array('Q', 'U', 'W'));
            break;
        default:
            $sets = array(0 => array('A', 'J', 'K'),
                          1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z'),
                          2 => array('F', 'S', 'X'),
                          3 => array('I', 'Y'),
                          4 => array('M', 'N'),
                          5 => array('Q', 'U'));
            break;
    }
    $matchset = array();
    for ($i = 0; $i < count($sets); $i++) {
        if (in_array($char1, $sets[$i])) {
            $matchset = $sets[$i];
        }
    }
    if (in_array($char2, $matchset) OR $char1 == $char2) {
        return TRUE;
    } else {
        return FALSE;
    }
}
?>
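The ID generation function mentioned in the note is not shown, so the following is only a guess at how it might look. Both `soundsAlike()` (a compact stand-in for the default branch of chrSoundAlike()) and `makeRadioId()` are hypothetical names introduced for this sketch.

```php
<?php
// Hypothetical stand-in for chrSoundAlike() using only its default groups.
function soundsAlike(string $a, string $b): bool
{
    $sets = ['AJK', 'BCDEGPTVZ', 'FSX', 'IY', 'MN', 'QU'];
    $a = strtoupper($a);
    $b = strtoupper($b);
    if ($a === $b) {
        return true;
    }
    foreach ($sets as $set) {
        if (strpos($set, $a) !== false && strpos($set, $b) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical generator: draw random letters and keep a letter only if it
// does not sound like any letter already in the ID.
function makeRadioId(int $length): string
{
    // 6 groups plus the 5 ungrouped letters (H, L, O, R, W) give 11 classes,
    // so longer IDs cannot be built without a sound-alike pair.
    if ($length < 0 || $length > 11) {
        throw new InvalidArgumentException('length must be between 0 and 11');
    }
    $id = '';
    while (strlen($id) < $length) {
        $candidate = chr(random_int(ord('A'), ord('Z')));
        $clash = false;
        for ($i = 0; $i < strlen($id); $i++) {
            if (soundsAlike($candidate, $id[$i])) {
                $clash = true;
                break;
            }
        }
        if (!$clash) {
            $id .= $candidate;
        }
    }
    return $id;
}

echo makeRadioId(6), "\n";
```

Rejection sampling keeps the sketch short; for IDs close to the 11-letter limit, picking one letter from each unused group would avoid repeated retries.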
mail at gettheeawayspam dot iaindooley dot com
11-Jul-2003 08:04
The soundex 'different letter in front' problem can be solved by using levenshtein() on the soundex codes. In my application, which searches a database of album names for entries that match a particular user-provided string, I do the following:
1. Search the database for the exact name
2. Search the database for entries where the name occurs anywhere as a substring
3. Search the database for entries where any of the words in the name (if the user has typed in more than one word) is present, except for little words (and, the, of etc.)
4. Then, if all this fails, I go to plan B:
- calculate the levenshtein distance (levenshtein()) between the user search term and each of the entries in the database as a percentage of the length of the user search term entered
- calculate the levenshtein distance between the metaphone codes of the user search term entered and each field in the database as a percentage of the length of the metaphone code of the user search term entered
- calculate the levenshtein distance between the soundex codes of the user search term entered and each field in the database as a percentage of the length of the soundex code of the original user search term entered
If any of these percentages is less than 50 (which means that two soundex codes with different first letters will be accepted!) then the entry is accepted as a possible match.
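A rough sketch of the "plan B" scoring, under the assumption that each percentage is the edit distance divided by the length of the search-term side. The helper names `distancePercent()` and `isFuzzyMatch()` are hypothetical, and the database access is left out.

```php
<?php
// Hypothetical helper: levenshtein distance as a percentage of the length
// of $needle (the user's side of the comparison).
function distancePercent(string $needle, string $candidate): float
{
    if ($needle === '') {
        return 100.0; // assumption: an empty term never counts as a match
    }
    return 100.0 * levenshtein($needle, $candidate) / strlen($needle);
}

// Hypothetical helper: accept $entry if the raw strings, the metaphone codes
// or the soundex codes differ by less than 50 percent.
function isFuzzyMatch(string $term, string $entry): bool
{
    $scores = [
        distancePercent($term, $entry),
        distancePercent(metaphone($term), metaphone($entry)),
        distancePercent(soundex($term), soundex($entry)),
    ];
    return min($scores) < 50;
}

var_dump(distancePercent(soundex('klansy'), soundex('clansy'))); // float(25)
var_dump(isFuzzyMatch('klansy', 'clansy'));                      // bool(true)
```

Because a soundex key is always 4 characters long, one differing character is 25 percent, so keys that differ only in the first letter pass the 50 percent threshold.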
php.net AT djwice DoT com
26-Jun-2003 02:01
I implemented Soundex in JavaScript.
http://www.vanderharg.nl/soundex.php
An explanation of the algorithm is on the above page.
It returns two values if a name contains "van der" or a similar prefix: one with the prefix included in the Soundex test and one without.
The use of regular expressions makes the actual soundex algorithm short. I removed two conditions of the algorithm because they are redundant in this implementation.
<script language="javascript">
var koppelteken = ""; // could also be "-".
var vv=new Array(
"de la ",
"in het ",
"in 't ",
"op den ",
"op het ",
"op de ",
"op te ",
"op 't ",
"up te",
"uit de ",
"van den ",
"van der ",
"van het ",
"van de ",
"van 't ",
"opte ",
"upte ",
"con ",
"den ",
"der ",
"ten ",
"ter ",
"van ",
"de ",
"di ",
"du ",
"la ",
"le ",
"te ",
"vd ",
"l' ",
"l'",
"'t ");
function removePrefix(name)
{
    var i = 0;
    var strippedresult = "";
    while ((strippedresult=="")&&(i<vv.length))
    {
        if (name.substr(0,vv[i].length)==vv[i].toUpperCase())
            strippedresult = name.substr(vv[i].length);
        i++;
    }
    return strippedresult;
}
function Soundex(name)
{
    if (name.length>1)
    {
        // convert to uppercase.
        name=name.toUpperCase();
        // convert punctuation marks
re = new RegExp ('[