How do I check whether a string consists only of letters and digits? (PHP)

The title says it all: I want to check whether a user's username contains anything that isn't a number or letter, such as €{¥]^}+<€, punctuation, spaces, or even characters like âæłęč. Is this possible in PHP?

  • doulu3808 2019-05-26 01:40

    You can use the ctype_alnum() function in PHP.

    From the manual:

    Check for alphanumeric character(s)
    Returns TRUE if every character in text is either a letter or a digit, FALSE otherwise.

    var_dump(ctype_alnum("æøsads")); // false
    var_dump(ctype_alnum("123asd")); // true
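
    As a rough sketch of how this could be wrapped for a username check: the helper name isAlnumUsername() is hypothetical, and the example assumes the default "C" locale, where only ASCII letters and digits count. Note that ctype_alnum() returns false for an empty string.

```php
<?php
// Hypothetical helper: true only for a non-empty string of ASCII letters and digits.
function isAlnumUsername($value): bool
{
    // ctype_alnum() is meant for strings, so reject every other type explicitly.
    return is_string($value) && ctype_alnum($value);
}

var_dump(isAlnumUsername('user123'));  // bool(true)
var_dump(isAlnumUsername('user 123')); // bool(false): contains a space
var_dump(isAlnumUsername(''));         // bool(false): empty string
var_dump(isAlnumUsername(123));        // bool(false): not a string
```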
  • dsjojts9734 2019-05-26 07:49

    PHP does REGEX

    What you want to do is fairly trivial; PHP has a number of regex functions.

    Testing a String For a Character

    If all you want is to know IF a string contains non-alphanumeric characters, then just use preg_match():

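    preg_match( '/[^A-Za-z0-9]/', $userName );

    This returns 1 if the username contains any character other than A-Z, a-z, or 0-9, and 0 if it contains none. Note that there is no * after the brackets (with it, the pattern would also match the empty string and always return 1) and no | inside them (inside brackets, | is a literal pipe character, not OR).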
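
    A minimal sketch of this check, assuming ASCII-only usernames. The helper names hasNonAlnum() and isValidUsername() are made up for illustration. The character class is written without | separators and without a trailing *, because | is a literal pipe inside brackets and * would let the pattern match the empty string.

```php
<?php
// Hypothetical helper: does $userName contain any character outside A-Z, a-z, 0-9?
function hasNonAlnum(string $userName): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/[^A-Za-z0-9]/', $userName) === 1;
}

// Hypothetical helper: is $userName non-empty and made up only of ASCII letters and digits?
function isValidUsername(string $userName): bool
{
    // \A and \z anchor the match to the very start and very end of the string.
    return preg_match('/\A[A-Za-z0-9]+\z/', $userName) === 1;
}

var_dump(hasNonAlnum('abc123'));     // bool(false)
var_dump(hasNonAlnum('abc 123'));    // bool(true)
var_dump(isValidUsername('abc123')); // bool(true)
var_dump(isValidUsername(''));       // bool(false)
```

    The second form is often the more convenient one for validation, since it also rejects an empty username.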

    Regex Pattern Elements

    PCRE regex patterns open and close with a delimiter such as a slash /, and the whole pattern is passed as a quoted string: '/myPattern/'. Some other key features are:

    [ brackets contain match sets ]
    [a-z] // means match any lowercase letter
    The current character in the string is checked against the set in these brackets; in this case it matches any single lowercase letter a to z.

    ^ Caret (Meta-Character)
    [^a-z] // means no lowercase letters
    If the caret ^ (aka hat) is the first character inside brackets, it NEGATES the set inside the brackets, so [^A7] means match anything EXCEPT uppercase A and the numeral 7. (Note: when outside brackets, the caret ^ means the start of the string.)

    \wWdDsS. Meta-Characters (WildCards)
    \w // match all alphanumeric
    An escaped (i.e. preceded by a backslash \ ) lowercase w means match any "word" character, i.e. alphanumeric and the underscore _; this is shorthand for [A-Za-z0-9_]. The uppercase \W is the NOT word character, equivalent to [^A-Za-z0-9_] or [^\w].

    .   // (dot) match ANY single character except return/newline
    \w  // match any word character [A-Za-z0-9_]
    \W  // NOT any word character [^A-Za-z0-9_]
    \d  // match any digit [0-9]
    \D  // NOT any digit [^0-9]
    \s  // match any whitespace (tab, space, newline)
    \S  // NOT any whitespace 
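
    To see these shorthand classes in action, here is a small sketch. The classify() helper and its array keys are hypothetical, and the patterns assume plain ASCII input (no /u modifier).

```php
<?php
// Hypothetical helper: report which shorthand classes apply to a string.
function classify(string $s): array
{
    return [
        'word_only'      => preg_match('/\A\w+\z/', $s) === 1, // only letters, digits, underscore
        'digits_only'    => preg_match('/\A\d+\z/', $s) === 1, // only 0-9
        'has_whitespace' => preg_match('/\s/', $s) === 1,      // at least one space, tab, newline
        'has_non_word'   => preg_match('/\W/', $s) === 1,      // at least one non-word character
    ];
}

print_r(classify('abc_123'));     // word_only is true, the rest are false
print_r(classify('hello world')); // has_whitespace and has_non_word are true
print_r(classify('42'));          // word_only and digits_only are true
```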

    .*+?| Meta-Characters (Quantifiers and Alternation)
    These modify the behavior of the above.

    *   // match previous character or [set] zero or more times, 
        // so .* means match everything (including nothing) until reaching a return/newline.
    +   // match previous at least one or more times.
    ?   // match previous only zero or one time (i.e. optional).
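    |   // means logical OR between alternatives, so:
        (cat|dog) // means match either "cat" or "dog"
        // Inside [brackets] the | is a literal pipe, not OR: write [A-CX-Z] to match any of A,B,C,X,Y, or Z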
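
    A short sketch of how these behave in practice. The matches() helper is hypothetical, and every pattern is anchored with \A and \z so the whole string has to match. The last line shows that | inside brackets is treated as a literal pipe character rather than OR.

```php
<?php
// Hypothetical helper: true if $pattern matches $s.
function matches(string $pattern, string $s): bool
{
    return preg_match($pattern, $s) === 1;
}

var_dump(matches('/\Aab*c\z/', 'ac'));       // true:  * allows zero "b"
var_dump(matches('/\Aab+c\z/', 'ac'));       // false: + needs at least one "b"
var_dump(matches('/\Aab+c\z/', 'abbbc'));    // true:  + allows several "b"
var_dump(matches('/\Acolou?r\z/', 'color')); // true:  ? makes the "u" optional
var_dump(matches('/\A(cat|dog)\z/', 'dog')); // true:  | picks between alternatives
var_dump(matches('/\A[A-C|X-Z]\z/', '|'));   // true:  inside [] the | is a literal pipe
```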

    Not shown: capture groups, backreferences, substitution (the real power of regex). See https://www.phpliveregex.com/#tab-preg-match for more including a live pattern-match playground that is based on the PHP functions, and delivers results as arrays.

    Back To Your String Cleaning

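    So for your pattern, to match anything that is not a letter or a number (underscores included), you need either '/[^A-Za-z0-9]/' or '/[\W_]/'. Do not put | inside the brackets: there it is a literal pipe character, so pipes would slip through.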

    Strip Search

    If instead you want to STRIP all the non-alpha characters from a string then use preg_replace( $Regex, $Replacement, $StringToClean )

        $username = 'Svéñ déGööfinøff';
        // [\W_]+ matches runs of non-word characters and underscores; no | is needed inside brackets
        echo preg_replace('/[\W_]+/', '', $username);

    The output is: SvdGfinff

    If you'd prefer to replace certain accented letters with standard Latin ones to keep the names reasonably readable, then I believe you'd need a lookup table (array). There is one ready to use at the PHP site.
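
    A minimal sketch of the lookup-table idea, using strtr() with an array. The table here is a deliberately tiny, hypothetical one that covers only the letters in the example name (a real table would need many more entries), and the helper name toAsciiUsername() is made up. It assumes the source file and the input are both UTF-8.

```php
<?php
// Hypothetical helper: map a few accented letters to ASCII, then strip the rest.
function toAsciiUsername(string $name): string
{
    // Tiny illustrative lookup table; extend it for real use.
    $map = [
        'é' => 'e',
        'ñ' => 'n',
        'ö' => 'o',
        'ø' => 'o',
    ];

    // Replace the accented letters first...
    $ascii = strtr($name, $map);

    // ...then remove whatever is still not an ASCII letter or digit.
    return preg_replace('/[^A-Za-z0-9]+/', '', $ascii);
}

echo toAsciiUsername('Svéñ déGööfinøff'); // SvendeGoofinoff
```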

