Add taint flows for remaining built-in pure functions such as utf8_decode, max($strings), etc #3636
Good point, I've fixed that |
Miscellaneous notes I might find useful if working on this in the future: In src/Psalm/Internal/Analyzer/Statements/Expression/Call/FunctionCallAnalyzer.php,
|
I improved things a little in e8be2c5, adding support for |
It might be useful to generate a map of all tainted json_encoded data that could be passed to a JS taint analysis tool. |
Some code to generate candidates is below. It excludes functions that are possibly impure depending on their arguments; some obvious ones are commented out. chop() is an alias of rtrim(). I don't know how taint detection currently works with array keys/values, or how it is meant to work.
<?php
use Phan\Plugin\Internal\UseReturnValuePlugin;
use Phan\Language\FQSEN\FullyQualifiedFunctionName;
use Phan\Language\UnionType;
use Phan\Language\Element\FunctionInterface;

require_once dirname(__DIR__) . '/src/Phan/Bootstrap.php';
$code_base = require(dirname(__DIR__) . '/src/codebase.php');
$unsafe_types = UnionType::fromFullyQualifiedRealString('string|array');

// A candidate can return a string/array and accepts at least one string/array parameter.
$isPotentialTaintPropagator = function (FunctionInterface $function) use ($code_base, $unsafe_types): bool {
    $function_return_type = $function->getUnionType();
    if (!$function_return_type->canCastToUnionType($unsafe_types)) {
        return false;
    }
    foreach ($function->getParameterList() as $param) {
        if ($param->getUnionType()->canCastToUnionType($unsafe_types)) {
            return true;
        }
    }
    return false;
};

foreach (UseReturnValuePlugin::HARDCODED_FQSENS as $fqsen_string => $value) {
    // Skip methods; only global functions are listed.
    if (strpos($fqsen_string, '::') !== false) {
        continue;
    }
    // Keep only entries flagged `true` (this is what drops the possibly impure functions mentioned above).
    if ($value !== true) {
        continue;
    }
    $fqsen = FullyQualifiedFunctionName::fromFullyQualifiedString($fqsen_string);
    if (!$code_base->hasFunctionWithFQSEN($fqsen)) {
        continue;
    }
    $function = $code_base->getFunctionByFQSEN($fqsen);
    // echo "looking up $fqsen\n";
    if (!$isPotentialTaintPropagator($function)) {
        continue;
    }
    echo "$function\n";
}
<?php
// Limitations:
// - Excludes uncommon functions like hebrev()
// - Excludes potentially impure functions such as var_export(), highlight_string()
// Returns original string if no translation is found
function _(string $message) : string;
// prefer htmlentities/escapeshellarg()
function addcslashes(string $str, string $charlist) : string;
function addslashes(string $str) : string;
// Taint checking probably won't be able to check if keys are tainted.
function array_change_key_case(array $input, int $case = unknown) : associative-array<mixed,mixed>;
function array_chunk(array $input, int $size, bool $preserve_keys = unknown) : list<array>;
function array_column(array $array, mixed $column_key, mixed $index_key = unknown) : array;
function array_combine(int[]|string[] $keys, array $values) : associative-array<mixed,mixed>|false;
function array_count_values(array $input) : associative-array<mixed,int>;
function array_diff_assoc(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed>;
function array_diff_key(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed>;
function array_diff(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed>;
function array_fill_keys(array $keys, mixed $val) : array;
function array_fill(int $start_key, int $num, mixed $val) : array<int,mixed>;
function array_filter(array $input, callable(mixed):bool|callable(mixed,mixed):bool $callback = unknown, int $flag = unknown) : associative-array<mixed,mixed>;
function array_flip(array $input) : associative-array<mixed,int>|associative-array<mixed,string>;
function array_intersect_assoc(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed>;
function array_intersect_key(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed>;
function array_intersect(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed>;
function array_key_first(array $array) : int|null|string;
function array_key_last(array $array) : int|null|string;
function array_keys(array $input, mixed $search_value = unknown, bool $strict = unknown) : list<int>|list<string>;
function array_map(?callable $callback, array $input1, array ...$args) : array;
function array_merge_recursive(array $arr1, array ...$args) : array;
function array_merge(array $arr1, array ...$args) : array;
function array_pad(array $input, int $pad_size, mixed $pad_value) : array;
function array_rand(array $input, int $num_req) : array<int,int>|array<int,string>|int|string;
function array_reduce(array $input, callable(mixed,mixed):mixed $callback, mixed $initial = unknown) : mixed;
function array_replace_recursive(array $arr1, array $arr2, array ...$args) : array;
function array_replace(array $arr1, array $arr2, array ...$args) : array;
function array_reverse(array $input, bool $preserve = unknown) : array;
function array_search(mixed $needle, array $haystack, bool $strict = unknown) : false|int|string;
function array_slice(array $input, int $offset, ?int $length = null, bool $preserve_keys = unknown) : array;
function array_unique(array $input, int $sort_flags = unknown) : associative-array<mixed,mixed>;
function array_values(array $input) : list<mixed>;
function base64_decode(string $str, bool $strict = unknown) : false|string;
// function base64_encode(string $str) : string;
// function base_convert(string $number, int $frombase, int $tobase) : string;
function basename(string $path, string $suffix = unknown) : string;
// function bin2hex(string $data) : string;
function bzcompress(string $source, int $blocksize100k = unknown, int $workfactor = unknown) : int|string;
function bzdecompress(string $source, int $small = unknown) : int|string;
function chop(string $str, string $character_mask = unknown) : string;
function chunk_split(string $str, int $chunklen = unknown, string $ending = unknown) : string;
// function class_implements(object|string $what, bool $autoload = unknown) : array<string,class-string>|false;
// function class_parents(object|string $instance, bool $autoload = unknown) : array<string,class-string>|false;
function compact(array|string $var_name, array|string ...$var_names) : array;
function convert_cyr_string(string $str, string $from, string $to) : string;
function convert_uudecode(string $data) : string;
function convert_uuencode(string $data) : string;
// function count_chars(string $input, int $mode = unknown) : array<int,int>|false|string;
function current(array|object $array_arg) : false|mixed;
function date(string $format, int $timestamp = unknown) : string;
function dirname(string $path, int $levels = unknown) : string;
function each(array &$arr) : array;
// eval safe?
function escapeshellarg(string $arg) : string;
function explode(string $separator, string $str, int $limit = unknown) : list<string>;
// function fgetcsv(resource $fp, int $length = unknown, string $delimiter = unknown, string $enclosure = unknown, string $escape = unknown) : false|list<?string>;
// function file(string $filename, int $flags = unknown, resource $context = unknown) : false|list<string>;
// filter_input filter types depends on $type/$filter
// function filter_input_array(int $type, array|int $definition = unknown, bool $add_empty = unknown) : false|mixed;
// function filter_input(int $type, string $variable_name, int $filter = unknown, array|int $options = unknown) : false|mixed;
// function filter_var(mixed $variable, int $filter = unknown, mixed $options = unknown) : false|mixed;
// function filter_var_array
// function get_cfg_var(string $option_name) : array[]|false|string|string[];
// function get_class_methods(mixed $class) : list<string>;
// function getenv(string $varname, bool $local_only = unknown) : false|string;
// function getimagesize(string $imagefile, array &$info = unknown) : false|int[]|string[];
// function get_parent_class(mixed $object = unknown) : class-string|false;
function gettext(string $msgid) : string;
// function gettype(mixed $var) : string;
// Can unescape $format with backslashes if user controlled
function gmdate(string $format, int $timestamp = unknown) : false|string;
function gzcompress(string $data, int $level = unknown, int $encoding = unknown) : false|string;
function gzdecode(string $data, int $length = unknown) : false|string;
function gzdeflate(string $data, int $level = unknown, int $encoding = unknown) : false|string;
function gzencode(string $data, int $level = unknown, int $encoding_mode = unknown) : false|string;
function gzinflate(string $data, int $length = unknown) : false|string;
function gzuncompress(string $data, int $length = unknown) : false|string;
// unsafe with $raw_output = true
// function hash_hmac(string $algo, string $data, string $key, bool $raw_output = unknown) : string;
// function hash_pbkdf2(string $algo, string $password, string $salt, int $iterations, int $length = unknown, bool $raw_output = unknown) : string;
// function hash(string $algo, string $data, bool $raw_output = unknown) : string;
function hex2bin(string $data) : false|string;
function htmlentities(string $string, int $quote_style = unknown, string $encoding = unknown, bool $double_encode = unknown) : string;
function html_entity_decode(string $string, int $quote_style = unknown, string $encoding = unknown) : string;
function htmlspecialchars_decode(string $string, int $quote_style = unknown) : string;
function htmlspecialchars(string $string, int $quote_style = unknown, string $encoding = unknown, bool $double_encode = unknown) : string;
function http_build_query(array|object $querydata, string $prefix = unknown, string $arg_separator = unknown, int $enc_type = unknown) : string;
function iconv(string $in_charset, string $out_charset, string $str) : false|string;
function implode(string $glue, array $pieces) : string;
//function inet_ntop(string $in_addr) : false|string;
//function inet_pton(string $ip_address) : false|string;
// function ini_get(string $varname) : false|string;
function join(string $glue, array $pieces) : string;
function json_decode(string $json, bool $assoc = unknown, int $depth = unknown, int $options = unknown) : mixed;
function json_encode(mixed $data, int $options = unknown, int $depth = unknown) : false|string;
function key(array|object $array_arg) : int|null|string;
function lcfirst(string $str) : string;
// function long2ip(int|string $proper_address) : string;
// already done
function ltrim(string $str, string $character_mask = unknown) : string;
// max() also works on strings.
function max(array $arg1) : mixed;
function mb_convert_case(string $sourcestring, int $mode, string $encoding = unknown) : false|string;
function mb_convert_encoding(string $str, string $to_encoding, string|string[] $from_encoding = unknown) : false|string;
function mb_detect_encoding(string $str, mixed $encoding_list = unknown, bool $strict = unknown) : false|string;
function mb_strtolower(string $str, string $encoding = unknown) : false|string;
function mb_substr(string $str, int $start, ?int $length = null, string $encoding = unknown) : false|string;
// Probably unrealistically wrong if $raw_output = true and sent to a sink
// function md5_file(string $filename, bool $raw_output = unknown) : false|string;
// function md5(string $str, bool $raw_output = unknown) : string;
// metaphone filters out non-letters?
// function metaphone(string $text, int $phones = unknown) : false|string;
function min(array $arg1) : mixed;
function ngettext(string $msgid1, string $msgid2, int $n) : string;
function nl2br(string $str, bool $is_xhtml = unknown) : string;
// $key is probably secret from application
// function openssl_encrypt(string $data, string $method, string $key, int $options = unknown, string $iv = unknown, string &$tag = unknown, string $aad = unknown, int $tag_length = unknown) : false|string;
function pack(string $format, mixed ...$args) : string;
// function parse_ini_file(string $filename, bool $process_sections = unknown, int $scanner_mode = unknown) : array|false;
// depends on arguments
// function parse_url(string $url, int $url_component = unknown) : array{scheme?:string,host?:string,port?:int,user?:string,pass?:string,path?:string,query?:string,fragment?:string}|false|int|null|string;
// function pathinfo(string $path, int $options = unknown) : array|string;
// function php_uname(string $mode = unknown) : string;
// function phpversion(string $extension = unknown) : false|string;
function preg_filter(mixed $regex, mixed $replace, mixed $subject, int $limit = unknown, int &$count = unknown) : string|string[];
function preg_grep(string $regex, array $input, int $flags = unknown) : array;
function preg_quote(string $str, string $delim_char = unknown) : string;
function preg_replace_callback(array|string $regex, callable(array):string $callback, array|string $subject, int $limit = unknown, int &$count = unknown) : string|string[];
function preg_replace_callback_array(array<string,callable(array):string> $pattern, array|string $subject, int $limit = unknown, int &$count = unknown) : string|string[];
function preg_replace(array|string $regex, array|string $replace, array|string $subject, int $limit = unknown, int &$count = unknown) : string|string[];
function preg_split(string $pattern, string $subject, ?int $limit = null, int $flags = unknown) : list<string>;
function quoted_printable_decode(string $str) : string;
function quoted_printable_encode(string $str) : string;
function quotemeta(string $str) : string;
// function range(mixed $low, mixed $high, float|int $step = unknown) : array;
function rawurldecode(string $str) : string;
function rawurlencode(string $str) : string;
function readlink(string $filename) : false|string;
function realpath(string $path) : false|string;
// already done
function rtrim(string $str, string $character_mask = unknown) : string;
function serialize(mixed $variable) : string;
// depends on raw_output, but impractical
// function sha1(string $str, bool $raw_output = unknown) : string;
// function soundex(string $str) : string;
function sprintf(string $format, float|int|string ...$vars) : string;
// function stat(string $filename) : array|false;
function strchr(string $haystack, int|string $needle, bool $before_needle = unknown) : false|string;
// function stream_resolve_include_path(string $filename) : false|string;
function strftime(string $format, int $timestamp = unknown) : string;
function stripcslashes(string $str) : string;
function stripslashes(string $str) : string;
function strip_tags(string $str, string|string[] $allowable_tags = unknown) : string;
function str_ireplace(array|string $search, array|string $replace, array|string $subject, int &$replace_count = unknown) : string|string[];
function stristr(string $haystack, int|string $needle, bool $before_needle = unknown) : false|string;
function str_pad(string $input, int $pad_length, string $pad_string = unknown, int $pad_type = unknown) : string;
function strpbrk(string $haystack, string $char_list) : false|string;
function strrchr(string $haystack, int|string $needle) : false|string;
function str_repeat(string $input, int $multiplier) : string;
function str_replace(array|string $search, array|string $replace, array|string $subject, int &$replace_count = unknown) : string|string[];
function strrev(string $str) : string;
function str_rot13(string $str) : string;
function str_split(string $str, int $split_length = unknown) : list<string>;
// depends on $needle (and $haystack if $before_needle)
function strstr(string $haystack, int|string $needle, bool $before_needle = unknown) : false|string;
function strtolower(string $str) : string;
function strtoupper(string $str) : string;
function strtr(string $str, string $from, string $to) : string;
function strval(mixed $var) : string;
function str_word_count(string $string, int $format = unknown, string $charlist = unknown) : array<int,string>|int;
function substr_replace(string|string[] $str, mixed $repl, mixed $start, mixed $length = unknown) : string|string[];
function substr(string $str, int $start, int $length = unknown) : false|string;
// mostly html safe but can contain " and >?
function tempnam(string $dir, string $prefix) : false|string;
function token_get_all(string $source, int $flags = unknown) : list<array{0:int,1:string,2:int}>|list<string>;
function trim(string $str, string $character_mask = unknown) : string;
function ucfirst(string $str) : string;
function ucwords(string $str, string $delims = unknown) : string;
function uniqid(string $prefix = unknown, bool $more_entropy = unknown) : string;
function unpack(string $format, string $data, int $offset = unknown) : array|false;
function urldecode(string $str) : string;
// mostly safe
// function urlencode(string $str) : string;
function utf8_decode(string $data) : string;
function utf8_encode(string $data) : string;
function vsprintf(string $format, array $args) : string;
function wordwrap(string $str, int $width = unknown, string $break = unknown, bool $cut = unknown) : string;
function zlib_decode(string $data, int $max_decoded_len = unknown) : string;
function zlib_encode(string $data, int $encoding, int|string $level = unknown) : string; |
And then there are other helpers like UConverter->convert(). I wonder if fuzzing would help build a larger list ahead of time, e.g. in Docker: instantiate classes and call methods to check for inputs that would emit |
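A rough sketch of what such a probe could look like for plain functions, assuming the candidates are pure and safe to call with arbitrary strings. The helper name `markerSurvives()` and the `MARKER` constant are made up for illustration and are not part of Psalm or Phan. It only checks whether a marker placed in one argument is still visible in the return value, so it is a heuristic: encoders such as base64_encode() would hide the marker even though the data still flows through.

```php
<?php
// Hypothetical marker; any string unlikely to appear by accident would do.
const MARKER = 'TAINTMARKER';

/**
 * Replace one argument with the marker, call the function, and report
 * whether the marker is still visible (case-insensitively) in the result.
 *
 * @param callable $fn       function under test
 * @param array    $args     baseline arguments that are known to be valid
 * @param int      $position index of the argument to replace with the marker
 */
function markerSurvives(callable $fn, array $args, int $position): bool
{
    $args[$position] = MARKER;
    try {
        $result = $fn(...$args);
    } catch (\Throwable $e) {
        // Invalid argument combinations are treated as "no flow observed".
        return false;
    }
    // print_r() flattens nested arrays, so list-returning functions are covered too.
    $flattened = print_r($result, true);
    return stripos($flattened, MARKER) !== false;
}

$probes = [
    ['strtoupper', ['x'], 0],
    ['str_pad', ['x', 20, '-'], 2],
    ['explode', [',', 'a,b'], 1],
    ['md5', ['x'], 0],
];
foreach ($probes as [$name, $args, $position]) {
    $verdict = markerSurvives($name, $args, $position) ? 'propagates' : 'no marker seen';
    printf("%s arg #%d: %s\n", $name, $position, $verdict);
}
```

With these probes the first three report a flow and md5() does not, since a hex digest cannot contain the marker. A result like that would only be a starting point for hand-reviewing which parameters belong in a `@psalm-flow` annotation.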
A second pass at adding functions to src/Psalm/Internal/Stubs/CoreGenericFunctions.phpstub, based on the earlier snippet. This helps with join(), strval(), etc., and probably has some incorrect entries.
Because of the missing type information, it may cause issues, and I'm not sure how Psalm will handle the PHP 8.0 changes (e.g. dropping support for [...]). It could be put into a plugin until those issues are worked out, though.
/**
* @psalm-pure
* @psalm-flow ($message) -> return
*/
function _(string $message) : string {}
// prefer htmlentities/escapeshellarg()
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function addcslashes(string $str, string $charlist) : string {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function addslashes(string $str) : string {}
// Taint checking probably won't be able to check if keys are tainted.
// /** @return associative-array<mixed, mixed> */
// function array_change_key_case(array $input, int $case = 0) : associative-array<mixed,mixed> {}
// function array_chunk(array $input, int $size, bool $preserve_keys = false) : list<array> {}
// function array_column(array $array, $column_key, $index_key = null) : array {}
// function array_combine(int[]|string[] $keys, array $values) : associative-array<mixed,mixed> {}
// function array_count_values(array $input) : associative-array<mixed,int> {}
// function array_diff_assoc(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed> {}
// function array_diff_key(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed> {}
// function array_diff(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed> {}
// function array_fill_keys(array $keys, $val) : array {}
// function array_fill(int $start_key, int $num, $val) : array<int,mixed> {}
// function array_filter(array $input, callable(mixed):bool|callable(mixed,mixed):bool $callback = null, int $flag = 0) : associative-array<mixed,mixed> {}
// function array_flip(array $input) : associative-array<mixed,int>|associative-array<mixed,string> {}
// function array_intersect_assoc(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed> {}
// function array_intersect_key(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed> {}
// function array_intersect(array $arr1, array $arr2, array ...$args) : associative-array<mixed,mixed> {}
// function array_key_first(array $array) : int|null|string {}
// function array_key_last(array $array) : int|null|string {}
// function array_keys(array $input, $search_value = unknown, bool $strict = false) : list<int>|list<string> {}
// function array_map(?callable $callback, array $input1, array ...$args) : array {}
// function array_merge_recursive(array $arr1, array ...$args) : array {}
// function array_merge(array $arr1, array ...$args) : array {}
// function array_pad(array $input, int $pad_size, $pad_value) : array {}
// function array_rand(array $input, int $num_req) : array<int,int>|array<int,string>|int|string {}
// function array_reduce(array $input, callable(mixed,mixed):$callback, $initial = null) {}
// function array_replace_recursive(array $arr1, array $arr2, array ...$args) : array {}
// function array_replace(array $arr1, array $arr2, array ...$args) : array {}
// function array_reverse(array $input, bool $preserve = false) : array {}
// function array_search($needle, array $haystack, bool $strict = false) : false|int|string {}
// function array_slice(array $input, int $offset, ?int $length = null, bool $preserve_keys = false) : array {}
// function array_unique(array $input, int $sort_flags = 2) : associative-array<mixed,mixed> {}
// function array_values(array $input) : list<mixed> {}
/**
* @psalm-pure
*
* @return string|false
*
* @psalm-flow ($str) -> return
*/
function base64_decode(string $str, bool $strict = false) {}
// function base64_encode(string $str) : string {}
// function base_convert(string $number, int $frombase, int $tobase) : string {}
/**
* @psalm-pure
* @psalm-flow ($path) -> return
*/
function basename(string $path, string $suffix = '') : string {}
// function bin2hex(string $data) : string {}
/**
* @return int|string
* @psalm-pure
* @psalm-flow ($source) -> return
*/
function bzcompress(string $source, int $blocksize100k = 4, int $workfactor = 0) {}
/**
* @return int|string
* @psalm-pure
* @psalm-flow ($source) -> return
*/
function bzdecompress(string $source, int $small = 0) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function chop(string $str, string $character_mask = " \t\n\r\0\x0B") : string {}
/**
* @psalm-pure
* @psalm-flow ($str, $ending) -> return
*/
function chunk_split(string $str, int $chunklen = 76, string $ending = "\r\n") : string {}
// function class_implements(object|string $what, bool $autoload = unknown) : array<string,class-string>|false {}
// function class_parents(object|string $instance, bool $autoload = unknown) : array<string,class-string>|false {}
/**
* @psalm-pure
* @psalm-flow ($var_name, $var_names) -> return
*/
function compact($var_name, ...$var_names) : array {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function convert_cyr_string(string $str, string $from, string $to) : string {}
/**
* @psalm-pure
* @psalm-flow ($data) -> return
*/
function convert_uudecode(string $data) : string {}
/**
* @psalm-pure
* @psalm-flow ($data) -> return
*/
function convert_uuencode(string $data) : string {}
// function count_chars(string $input, int $mode = unknown) : array<int,int>|false|string {}
/**
* @param object|array $array_arg
* @psalm-pure
* @psalm-flow ($array_arg) -> return
*/
function current($array_arg) {}
/**
* @psalm-pure
* @psalm-flow ($path) -> return
*/
function dirname(string $path, int $levels = 1) : string {}
/**
* @psalm-taint-specialize
* @psalm-flow ($arr) -> return
*/
function each(array &$arr) : array {}
// eval safe?
/**
* @psalm-pure
* @psalm-flow ($arg) -> return
* @psalm-taint-escape shell
*/
function escapeshellarg(string $arg) : string {}
// function fgetcsv(resource $fp, int $length = unknown, string $delimiter = unknown, string $enclosure = unknown, string $escape = unknown) : false|list<?string> {}
// function file(string $filename, int $flags = unknown, resource $context = unknown) : false|list<string> {}
// filter_input filter types depends on $type/$filter
// function filter_input_array(int $type, array|int $definition = unknown, bool $add_empty = unknown) : false|mixed {}
// function filter_input(int $type, string $variable_name, int $filter = unknown, array|int $options = unknown) : false|mixed {}
// function filter_var($variable, int $filter = unknown, $options = unknown) : false|mixed {}
// function filter_var_array
// function get_cfg_var(string $option_name) : array[]|false|string|string[] {}
// function get_class_methods($class) : list<string> {}
// function getenv(string $varname, bool $local_only = unknown) : false|string {}
// function getimagesize(string $imagefile, array &$info = unknown) : false|int[]|string[] {}
// function get_parent_class($object = unknown) : class-string|false {}
function gettext(string $msgid) : string {}
// function gettype($var) : string {}
// Can unescape $format with backslashes if user controlled
function gmdate(string $format, int $timestamp = null) : string {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($data) -> return
*/
function gzcompress(string $data, int $level = -1, int $encoding = 15) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($data) -> return
*/
function gzdecode(string $data, int $length = 0) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($data) -> return
*/
function gzdeflate(string $data, int $level = -1, int $encoding = -15) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($data) -> return
*/
function gzencode(string $data, int $level = -1, int $encoding_mode = 31) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($data) -> return
*/
function gzinflate(string $data, int $length = 0) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($data) -> return
*/
function gzuncompress(string $data, int $length = 0) {}
// unsafe with $raw_output = true
// function hash_hmac(string $algo, string $data, string $key, bool $raw_output = unknown) : string {}
// function hash_pbkdf2(string $algo, string $password, string $salt, int $iterations, int $length = unknown, bool $raw_output = unknown) : string {}
// function hash(string $algo, string $data, bool $raw_output = unknown) : string {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($data) -> return
*/
function hex2bin(string $data) {}
/**
* @psalm-pure
* @param array|object $querydata
* @psalm-flow ($querydata) -> return
*/
function http_build_query($querydata, string $prefix = '', string $arg_separator = '', int $enc_type = 1) : string {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($str) -> return
*/
function iconv(string $in_charset, string $out_charset, string $str) {}
//function inet_ntop(string $in_addr) {}
//function inet_pton(string $ip_address) {}
// function ini_get(string $varname) {}
/**
* @psalm-pure
* @psalm-flow ($glue, $pieces) -> return
*/
function join(string $glue, array $pieces) : string {}
/**
* @psalm-pure
* @psalm-flow ($json) -> return
* TODO What taints does this unescape? (\uxxxx can quote)
*/
function json_decode(string $json, bool $assoc = null, int $depth = 512, int $options = 0) {}
/**
* @psalm-pure
* @psalm-flow ($data) -> return
* @psalm-taint-escape html
* @return false|string
*/
function json_encode($data, int $options = 0, int $depth = 512) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function lcfirst(string $str) : string {}
// function long2ip(int|string $proper_address) : string {}
// already done
// max() also works on strings.
/**
* @psalm-pure
* @psalm-flow ($arg1) -> return
*/
function max(array $arg1) {}
/**
* @psalm-pure
* @psalm-flow ($sourcestring) -> return
*/
function mb_convert_case(string $sourcestring, int $mode, string $encoding = null) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($str) -> return
*/
function mb_convert_encoding(string $str, string $to_encoding, $from_encoding = false) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($str) -> return
*/
function mb_detect_encoding(string $str, $encoding_list = null, bool $strict = false) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($str) -> return
*/
function mb_strtolower(string $str, string $encoding = null) {}
/**
* @psalm-pure
* @return false|string
* @psalm-flow ($str) -> return
*/
function mb_substr(string $str, int $start, ?int $length = null, string $encoding = '') {}
// Probably unrealistically wrong if $raw_output = true and sent to a sink
// function md5_file(string $filename, bool $raw_output = unknown) {}
// function md5(string $str, bool $raw_output = unknown) : string {}
// metaphone filters out non-letters?
// function metaphone(string $text, int $phones = unknown) {}
/**
* @psalm-pure
* @return mixed
* @psalm-flow ($arg1) -> return
*/
function min(array $arg1) {}
/**
* @psalm-pure
* @return string
* @psalm-flow ($msgid1, $msgid2) -> return
*/
function ngettext(string $msgid1, string $msgid2, int $n) : string {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function nl2br(string $str, bool $is_xhtml = false) : string {}
// $key is probably secret from application
// function openssl_encrypt(string $data, string $method, string $key, int $options = unknown, string $iv = unknown, string &$tag = unknown, string $aad = unknown, int $tag_length = unknown) {}
// function pack(string $format, mixed ...$args) : string {}
// function parse_ini_file(string $filename, bool $process_sections = unknown, int $scanner_mode = unknown) {}
// depends on arguments
// function parse_url(string $url, int $url_component = unknown) : array{scheme?:string,host?:string,port?:int,user?:string,pass?:string,path?:string,query?:string,fragment?:string}|false|int|null|string {}
// function pathinfo(string $path, int $options = unknown) {}
// function php_uname(string $mode = unknown) : string {}
// function phpversion(string $extension = unknown) {}
/**
* @psalm-pure
* @psalm-flow ($subject) -> return
*/
function preg_filter($regex, $replace, $subject, int $limit = -1, int &$count = null) {}
/**
* @psalm-pure
* @psalm-flow ($subject) -> return
*/
function preg_replace_callback_array(array $pattern, $subject, int $limit = -1, int &$count = null) {}
/**
* @psalm-pure
* @psalm-flow ($subject) -> return
*/
function preg_split(string $pattern, string $subject, ?int $limit = -1, int $flags = 0) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function quoted_printable_decode(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function quoted_printable_encode(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function quotemeta(string $str) {}
// function range($low, $high, float|int $step = unknown) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
* @psalm-taint-unescape html
*/
function rawurldecode(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
* @psalm-taint-escape html
*/
function rawurlencode(string $str) {}
// not pure
// function readlink(string $filename) {}
// already done
/**
* @psalm-pure depending on definition
* @psalm-flow ($variable) -> return
*/
function serialize($variable) {}
// depends on raw_output, but impractical
// function sha1(string $str, bool $raw_output = unknown) {}
// function soundex(string $str) {}
// function stat(string $filename) {}
/**
* @psalm-pure
* @psalm-flow ($haystack) -> return
* The result is always a substring of $haystack, with or without $before_needle
*/
function strchr(string $haystack, $needle, bool $before_needle = false) {}
// function stream_resolve_include_path(string $filename) {}
/**
* @psalm-pure
* @psalm-flow ($format) -> return
* Backslashes can be used for special characters
*/
function strftime(string $format, int $timestamp = null) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function stripcslashes(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function stripslashes(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($replace, $subject) -> return
*/
function str_ireplace($search, $replace, $subject, int &$replace_count = 0) {}
/**
* @psalm-pure
* @psalm-flow ($haystack) -> return
*/
function stristr(string $haystack, $needle, bool $before_needle = false) {}
/**
* @psalm-pure
* @psalm-flow ($input, $pad_string) -> return
*/
function str_pad(string $input, int $pad_length, string $pad_string = '', int $pad_type = 0) {}
/**
* @psalm-pure
* @psalm-flow ($haystack) -> return
*/
function strpbrk(string $haystack, string $char_list) {}
/**
* @psalm-pure
* @psalm-flow ($haystack, $needle) -> return
*/
function strrchr(string $haystack, $needle) {}
/**
* @psalm-pure
* @psalm-flow ($input) -> return
*/
function str_repeat(string $input, int $multiplier) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function strrev(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function str_rot13(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function str_split(string $str, int $split_length = 1) {}
// returns a portion of $haystack in both modes (with or without $before_needle)
/**
* @psalm-pure
* @psalm-flow ($haystack) -> return
*/
function strstr(string $haystack, string $needle, bool $before_needle = false) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function strtr(string $str, string $from, string $to) {}
/**
* @psalm-pure
* @psalm-flow ($var) -> return
*/
function strval($var) {}
/**
* @psalm-pure
* @psalm-flow ($string) -> return
*/
function str_word_count(string $string, int $format = 0, string $charlist = '') {}
/**
* @psalm-pure
* @psalm-flow ($str, $repl) -> return
*/
function substr_replace($str, $repl, $start, $length = 0) {}
// mostly html safe but can contain " and >?
/**
* (not pure: tempnam() creates a file on disk)
* @psalm-flow ($dir, $prefix) -> return
*/
function tempnam(string $dir, string $prefix) {}
/**
* @psalm-pure
* @psalm-flow ($source) -> return
*/
function token_get_all(string $source, int $flags = 0) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function ucfirst(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function ucwords(string $str, string $delims = " \t\r\n\f\v") {}
/**
* (not pure: the result depends on the current time)
* @psalm-flow ($prefix) -> return
*/
function uniqid(string $prefix = '', bool $more_entropy = false) {}
/**
* @psalm-pure
* TODO
*/
function unpack(string $format, string $data, int $offset = 0) {}
/**
* TODO: This also may add taints other than html?
* @psalm-pure
* @psalm-flow ($str) -> return
* @psalm-taint-unescape html
*/
function urldecode(string $str) {}
// mostly safe
// function urlencode(string $str) {}
/**
* @psalm-pure
* @psalm-flow ($data) -> return
*/
function utf8_decode(string $data) {}
/**
* @psalm-pure
* @psalm-flow ($data) -> return
*/
function utf8_encode(string $data) {}
/**
* @psalm-pure
* @psalm-flow ($format, $args) -> return
*/
function vsprintf(string $format, array $args) {}
/**
* @psalm-pure
* @psalm-flow ($str) -> return
*/
function wordwrap(string $str, int $width = 75, string $break = "\n", bool $cut = false) {}
/**
* @psalm-pure
* @psalm-flow ($data) -> return
*/
function zlib_decode(string $data, int $max_decoded_len = 0) {}
/**
* @psalm-pure
* @psalm-flow ($data) -> return
*/
function zlib_encode(string $data, int $encoding, $level = -1) {} |
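As a rough sanity check on a few of the candidate flows above, the sketch below shows in plain PHP (no Psalm involved) that the listed parameters really end up in the return value. The `$tainted` value is a made-up stand-in for attacker-controlled input.

```php
<?php
// Hypothetical attacker-controlled value, for illustration only.
$tainted = '<script>';

// strstr() returns a slice of $haystack, so taint in $haystack reaches the result.
$slice = strstr('id=' . $tainted, '=');

// str_pad() copies $pad_string into the result when padding is needed.
$padded = str_pad('x', 9, $tainted);

// substr_replace() splices the replacement into the subject string.
$spliced = substr_replace('hello', $tainted, 0, 0);

// vsprintf() interpolates every element of $args into $format.
$formatted = vsprintf('<p>%s</p>', [$tainted]);

// Each result still contains the tainted value.
foreach ([$slice, $padded, $spliced, $formatted] as $result) {
    var_dump(strpos($result, $tainted) !== false);
}
```

Every line prints `bool(true)`, which is the behaviour the `@psalm-flow` annotations are meant to describe.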
This adds string functions from https://www.php.net/manual/en/ref.strings.php This commit adds the flows for functions from "addcslashes" to "sprintf". More are to follow in later commits. Ref vimeo#3636
This should conclude all string functions from https://www.php.net/manual/en/book.strings.php Continuation of vimeo#4576 Ref vimeo#3636
* Add string functions from sscanf to wordwrap
This should conclude all string functions from https://www.php.net/manual/en/book.strings.php
Continuation of #4576
Ref #3636
* Add StrTrReturnTypeProvider
* Fix psalm error
* phpcs
* Line length
* Ignore false return on vsprintf
Co-authored-by: Matthew Brown
I believe all functions in the callmap are assumed pure unless they're listed in an 'impure list' array somewhere. Is there anything left to do for this issue that I've missed? |
My original request was to add taint flows for the remaining built-in pure functions; danog@4de2bf8 and other associated commits did that for the most commonly used ones. However, some less common ones remain: for example, https://psalm.dev/r/9b5d105c35 emits "No issues" but I'd expect a taint warning |
I found these snippets:
https://psalm.dev/r/9b5d105c35
<?php // --taint-analysis
echo urldecode($_GET['x']);
|
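To illustrate why a flow through urldecode() matters, here is a small sketch in plain PHP. The payload is a hypothetical example: it contains no `<` or `>` before decoding, so an escape applied too early leaves it unchanged, yet the decoded result is live markup.

```php
<?php
// Hypothetical percent-encoded payload; stands in for a double-encoded $_GET value.
$encoded = '%3Cscript%3Ealert(1)%3C%2Fscript%3E';

// No angle brackets yet, so htmlspecialchars() leaves the string unchanged.
$escapedTooEarly = htmlspecialchars($encoded);

// Decoding afterwards restores the markup.
$decoded = urldecode($escapedTooEarly);

echo $decoded, PHP_EOL;
```

This is one reason to treat urldecode() as both propagating taint and undoing an earlier html escape, as the `@psalm-taint-unescape html` annotation in the candidate list suggests.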
It seems like Psalm only knows about functions in src/Psalm/Internal/Stubs/CoreGenericFunctions.phpstub. It may help to expand that list to other pure functions such as json_encode(), base64_decode(), trim(), etc.
This example still allows evaluating arbitrary code, such as
echo "$(ls)"
https://github.com/phan/phan/blob/master/src/Phan/Plugin/Internal/UseReturnValuePlugin.php may be of help, because it lists many common "pure" functions that return a value based on their inputs.
Off-topic notes:
Aside: $_REQUEST combines $_GET and $_POST, so should that also be included as a source?
$_COOKIE can be set by browsers, so should that be considered for eval (but possibly not html)
Aside: json_encode() technically escapes html, because the only reasonable place to echo it is inside <script>. However, it might be a useful sanity check to assert that the file contains </script> after the echo line but before the next <script> substring, if any occur. Probably not worth the effort.
<p><?= json_encode($unsafe) ?></p> is potentially unsafe for emitting (malformed) html tags
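A minimal sketch of the json_encode() point, in plain PHP. The `$unsafe` value is hypothetical, and JSON_HEX_TAG is shown only as one existing flag that changes the outcome, not as something the issue proposes.

```php
<?php
// Hypothetical unsafe value, for illustration only.
$unsafe = '<b>x</b>';

// Default flags escape "/" but leave "<" and ">" untouched.
$default = json_encode($unsafe);

// JSON_HEX_TAG encodes "<" and ">" as unicode escapes instead.
$hexTag = json_encode($unsafe, JSON_HEX_TAG);

echo '<p>', $default, '</p>', PHP_EOL;
echo '<p>', $hexTag, '</p>', PHP_EOL;
```

With default flags the opening `<b>` tag survives inside the paragraph, so the output is only html-safe when it is echoed inside a script block.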