
Weighted random methods for Array and Normal distribution in Ruby

Notes on two pieces of Ruby code I used recently: weighted random methods for arrays, and a normal distribution generator.

Both were found online and have held up in practice. The runtime environment is ruby-1.9.2.

Weighted random methods for arrays

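Reopen Array and add two methods, random() and randomize(). Both accept an array of weights and favor the elements with larger weights: random() returns one element chosen at random according to the weights, and randomize() returns a permutation ordered at random according to the weights. See the code comments for details: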

    class Array
      # Chooses a random array element from the receiver based on the weights
      # provided. If _weights_ is nil, then each element is weighed equally.
      #
      #   [1,2,3].random          #=> 2
      #   [1,2,3].random          #=> 1
      #   [1,2,3].random          #=> 3
      #
      # If _weights_ is an array, then each element of the receiver gets its
      # weight from the corresponding element of _weights_. Notice that it
      # favors the element with the highest weight.
      #
      #   [1,2,3].random([1,4,1]) #=> 2
      #   [1,2,3].random([1,4,1]) #=> 1
      #   [1,2,3].random([1,4,1]) #=> 2
      #   [1,2,3].random([1,4,1]) #=> 2
      #   [1,2,3].random([1,4,1]) #=> 3
      #
      # If _weights_ is a symbol, the weight array is constructed by calling
      # the appropriate method on each array element in turn. Notice that
      # it favors the longer word when using :length.
      #
      #   ['dog', 'cat', 'hippopotamus'].random(:length) #=> "hippopotamus"
      #   ['dog', 'cat', 'hippopotamus'].random(:length) #=> "dog"
      #   ['dog', 'cat', 'hippopotamus'].random(:length) #=> "hippopotamus"
      #   ['dog', 'cat', 'hippopotamus'].random(:length) #=> "hippopotamus"
      #   ['dog', 'cat', 'hippopotamus'].random(:length) #=> "cat"
      def random(weights=nil)
        return random(map {|n| n.send(weights)}) if weights.is_a? Symbol

        weights ||= Array.new(length, 1.0)
        total = weights.inject(0.0) {|t,w| t+w}
        # point is uniform in [0, total); walk the weights until it falls
        # inside an element's slice
        point = rand * total

        zip(weights).each do |n,w|
          # strict comparison, so a zero-weight element is never picked
          return n if point < w
          point -= w
        end

        # reached only for an empty receiver (nil) or after floating-point
        # drift; without this line the method would return the zipped array
        last
      end

      # Generates a permutation of the receiver based on _weights_ as in
      # Array#random. Notice that it favors the element with the highest
      # weight.
      #
      #   [1,2,3].randomize           #=> [2,1,3]
      #   [1,2,3].randomize           #=> [1,3,2]
      #   [1,2,3].randomize([1,4,1])  #=> [2,1,3]
      #   [1,2,3].randomize([1,4,1])  #=> [2,3,1]
      #   [1,2,3].randomize([1,4,1])  #=> [1,2,3]
      #   [1,2,3].randomize([1,4,1])  #=> [2,3,1]
      #   [1,2,3].randomize([1,4,1])  #=> [3,2,1]
      #   [1,2,3].randomize([1,4,1])  #=> [2,1,3]
      def randomize(weights=nil)
        return randomize(map {|n| n.send(weights)}) if weights.is_a? Symbol

        weights = weights.nil? ? Array.new(length, 1.0) : weights.dup

        # pick out elements until there are none left
        list, result = self.dup, []
        until list.empty?
          # pick an element and find its position in the temporary list
          index = list.index(list.random(weights))
          # remove exactly one element and its weight; Array#delete would
          # remove every equal element and leave the weights misaligned
          result << list.delete_at(index)
          weights.delete_at(index)
        end

        result
      end
    end

From: http://snippets.dzone.com/posts/show/898
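
The weighting can be checked empirically by drawing many times and counting the results. The following is a minimal sketch rather than part of the original snippet: `weighted_pick` is a hypothetical standalone helper that follows the same cumulative-weight idea, and it takes a seeded `Random` instance (an assumption made here so that runs are repeatable).

```ruby
# Hypothetical helper (not from the original snippet): pick one item with
# probability proportional to its weight.
def weighted_pick(items, weights, rng = Random.new)
  total = weights.inject(0.0) { |sum, w| sum + w }
  point = rng.rand * total            # uniform in [0, total)
  items.zip(weights).each do |item, w|
    return item if point < w          # point landed inside this item's slice
    point -= w
  end
  items.last                          # guard against floating-point drift
end

# Tally 10,000 draws with weights 1:4:1.
rng = Random.new(42)
counts = Hash.new(0)
10_000.times { counts[weighted_pick([1, 2, 3], [1, 4, 1], rng)] += 1 }
counts.sort.each { |item, n| puts "#{item}: #{n}" }
```

With weights 1:4:1, the element 2 should come up in roughly two thirds of the draws, and 1 and 3 in roughly one sixth each.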

A Ruby implementation of the normal distribution

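Concept: the normal distribution is a probability distribution of a continuous random variable with two parameters, μ and σ². The first parameter, μ, is the mean of the random variable, and the second, σ², is its variance, so the distribution is written N(μ, σ²). A normally distributed random variable is most likely to take values near μ and less likely to take values farther from μ. The smaller σ is, the more tightly the distribution is concentrated around μ; the larger σ is, the more spread out it is. The density function is symmetric about μ, reaches its maximum at μ, tends to 0 as x goes to positive or negative infinity, and has inflection points at μ±σ. It is high in the middle and low on both sides, and its graph is a bell-shaped curve lying above the x-axis. When μ = 0 and σ² = 1, it is called the standard normal distribution, written N(0, 1).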
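
For reference, the properties above can be read off the standard density function of N(μ, σ²), written here in LaTeX:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
f''(x) = f(x)\left(\frac{(x-\mu)^2}{\sigma^4} - \frac{1}{\sigma^2}\right)
```

The exponent depends only on (x − μ)², which gives the symmetry about μ and the maximum value 1/(σ√(2π)) at x = μ. The second derivative vanishes exactly when (x − μ)² = σ², that is, at the inflection points μ±σ.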

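    class RandomGaussian
      # mean and stddev describe the target distribution; rand_helper is any
      # callable returning a uniform random number in [0, 1)
      def initialize(mean = 0.0, stddev = 1.0, rand_helper = lambda { Kernel.rand })
        @rand_helper = rand_helper
        @mean = mean
        @stddev = stddev
        @valid = false
        @next = 0
      end

      # Each Box-Muller step yields two independent samples: return one now
      # and cache the other for the next call.
      def rand
        if @valid
          @valid = false
          @next
        else
          @valid = true
          x, y = self.class.gaussian(@mean, @stddev, @rand_helper)
          @next = y
          x
        end
      end

      # Box-Muller transform: turns two uniform samples into two normal
      # samples. The original listing put `private` above this method, but
      # `private` has no effect on `def self.` methods, and the method must
      # stay public because it is called as self.class.gaussian.
      def self.gaussian(mean, stddev, rand)
        theta = 2 * Math::PI * rand.call
        # 1 - rand.call lies in (0, 1], so the logarithm is always finite
        rho = Math.sqrt(-2 * Math.log(1 - rand.call))
        scale = stddev * rho
        x = mean + scale * Math.cos(theta)
        y = mean + scale * Math.sin(theta)
        return x, y
      end
    end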

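The first argument is the mean of the distribution and the second is its standard deviation (stddev, that is σ, not the variance σ²). The defaults give the standard normal distribution. Just call the instance method rand():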

    RandomGaussian.new.rand

To see the shape of the distribution, call it in a loop and tally the results.
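
One way to do that tally is sketched below. It is self-contained rather than built on the class above: `gaussian_sample` is a hypothetical helper that performs a single Box-Muller draw, and the seeded `Random`, the sample count and the bucket width of 0.5 are all assumptions chosen for illustration.

```ruby
# Hypothetical standalone sampler (names are illustrative): one Box-Muller
# draw from N(mean, stddev^2) using the given Random instance.
def gaussian_sample(rng, mean = 0.0, stddev = 1.0)
  theta = 2 * Math::PI * rng.rand
  rho = Math.sqrt(-2 * Math.log(1 - rng.rand))
  mean + stddev * rho * Math.cos(theta)
end

rng = Random.new(1)
samples = Array.new(20_000) { gaussian_sample(rng) }

# Sample mean and standard deviation should be close to 0 and 1.
sample_mean = samples.inject(0.0) { |s, v| s + v } / samples.size
variance = samples.inject(0.0) { |s, v| s + (v - sample_mean)**2 } / samples.size
sample_stddev = Math.sqrt(variance)
puts "mean=#{sample_mean.round(3)} stddev=#{sample_stddev.round(3)}"

# Text histogram: one bucket per 0.5 between -3 and 3, one '#' per 100 samples.
buckets = Hash.new(0)
samples.each do |v|
  next if v < -3.0 || v >= 3.0
  buckets[(v / 0.5).floor] += 1
end
(-6..5).each do |b|
  puts format('%5.1f | %s', b * 0.5, '#' * (buckets[b] / 100))
end
```

The bars should be longest around 0 and taper off toward both ends, which is the bell shape described earlier.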

From: http://stackoverflow.com/questions/5825680/code-to-generate-gaussian-normally-distributed-random-numbers-in-ruby