Legal claims defining the scope of protection, as filed with the USPTO.
1. A data processing apparatus comprising: a vector register store configured to store vector operands comprising a plurality of data elements; processing circuitry configured to process vector operands from the vector register store; and control circuitry configured to control the processing circuitry to perform a vector scan operation on M data elements of a source vector operand V[0] to V[M−1] and at least one additional data element S, to generate M data elements of a result vector operand R[0] to R[M−1], where M is a power of 2, each data element of the source vector operand and the result vector operand comprises a plurality of bits, and for N≤M and 0≤i<N data element R[i] of the result vector operand has a value corresponding to a combination of the at least one additional data element S and at least some of data elements V[0] to V[i] of the source vector operand; wherein the control circuitry is configured to control the processing circuitry to perform the vector scan operation in a plurality of steps, each step for generating a second vector from a first vector, where the first vector for a first step comprises data elements of the source vector operand, and the first vector for other steps comprises the second vector of the preceding step, each step comprising at least one combination operation for combining a data element of the first vector with the at least one additional data element S or another data element of the first vector to generate a data element of the second vector; at least one of said plurality of steps comprises a plurality of combination operations performed in parallel; and at least two of said plurality of steps comprise a combination operation for combining a data element of the first vector with the at least one additional data element S; wherein at least in response to a vector scan operation for which N>G, where M/2≤G<M: the control circuitry is configured to control the processing circuitry to perform said plurality of steps comprising at least one combination step for generating G data elements R[0] to R[G−1] of the result vector operand, and at least one further step for generating data elements R[G] to R[M−1] of said result vector operand when performed in addition to the at least one combination step; said at least one combination step is such that, when said at least one combination step was performed by said processing circuitry without performing the at least one further step, then the processing circuitry would generate data elements [G] to [M−1] of the second vector in a final step of said at least one combination step, wherein for G≤k<M data element [k] of the second vector has a value corresponding to a combination of at least some of data elements V[0] to V[k] of the source vector operand, and in said at least one further step, the control circuitry is configured to control the processing circuitry, for G≤k<N, to perform a combination operation for combining the at least one additional data element S with data element [k] of the first vector for the at least one further step.
2. The data processing apparatus according to claim 1, wherein in response to a vector scan operation for which M/2≤N≤G, the control circuitry is configured to control the processing circuitry to perform said at least one combination step and to omit said at least one further step.
3. The data processing apparatus according to claim 1, wherein the at least one further step comprises at least one combination operation for combining a data element of the first vector with the at least one additional data element S, and does not comprise any combination operations for combining a data element of the first vector with another data element of the first vector.
4. The data processing apparatus according to claim 1, wherein M=2^P and said at least one combination step comprises P steps for generating data elements [0] to [M−1] of a second vector in response to data elements [0] to [M−1] of a first vector; and for 0≤J<P, step J of said P steps comprises, for 2^J≤m<N, a combination operation for combining data element [m] of said first vector with data element [m−2^J] of said first vector to generate data element [m] of said second vector.
5. The data processing apparatus according to claim 4, wherein M−G is a power of two, and for log₂(M−G)≤J<P, step J of said P steps comprises, for 2^J−(M−G)≤q<2^J, a combination operation for combining data element [q] of the first vector with said additional data element S to generate data element [q] of the second vector.
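The step structure recited in claims 1, 2, 4 and 5 can be illustrated in software. The following is only a minimal sketch, not the claimed hardware: it assumes the combination operation is an associative function (addition by default), models each step's parallel operations with a sequential loop that reads the first vector and writes the second vector, and uses a hypothetical function name `vector_scan` and parameter names `g` and `n` for G and N.

```python
from operator import add


def vector_scan(v, s, g, n=None, op=add):
    """Scan v with an extra element s, following the steps of claims 4 and 5.

    v  : list of M source elements (M a power of two)
    s  : the additional data element S
    g  : G, with M/2 <= G < M and M - G a power of two
    n  : N, number of result elements required (defaults to M)
    op : associative combination operation (assumption: addition by default)
    """
    m_len = len(v)
    if m_len < 1 or m_len & (m_len - 1):
        raise ValueError("M must be a power of two")
    d = m_len - g  # M - G
    if not (m_len // 2 <= g < m_len) or d & (d - 1):
        raise ValueError("need M/2 <= G < M with M - G a power of two")
    n = m_len if n is None else n
    p = m_len.bit_length() - 1   # M = 2^P
    log_d = d.bit_length() - 1   # log2(M - G)

    first = list(v)
    for j in range(p):           # the P combination steps
        stride = 1 << j
        second = list(first)     # all reads below use `first` only
        # Claim 4: combine element [m] with element [m - 2^J].
        for m in range(stride, n):
            second[m] = op(first[m - stride], first[m])
        # Claim 5: combine elements [2^J - (M-G), 2^J) with S.
        if j >= log_d:
            for q in range(stride - d, stride):
                second[q] = op(s, first[q])
        first = second

    # Further step (claim 1), omitted when N <= G (claim 2).
    if n > g:
        second = list(first)
        for k in range(g, n):
            second[k] = op(s, first[k])
        first = second
    return first
```

For example, `vector_scan([1, 2, 3, 4, 5, 6, 7, 8], 10, g=6)` returns `[11, 13, 16, 20, 25, 31, 38, 46]`: after the three combination steps elements 0 to 5 already include S, and the further step adds S to elements 6 and 7 only.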
6. The data processing apparatus according to claim 1, wherein the control circuitry is configured to control the processing circuitry to output a carry value corresponding to data element R[N−1] of the result vector operand; wherein when the at least one further step is omitted then the control circuitry is configured to control the processing circuitry to output the carry value earlier than when the at least one further step is performed by the processing circuitry.
7. The data processing apparatus according to claim 1, wherein the vector scan operation is associated with control information identifying which data elements of the source vector operand are selected data elements, and the control circuitry is configured to control the processing circuitry to process the selected data elements in the vector scan operation.
8. The data processing apparatus according to claim 7, wherein N has a value such that V[N−1] is the last selected data element of the source vector operand indicated by the control information.
9. The data processing apparatus according to claim 7, wherein the control circuitry is configured to control the processing circuitry to set data elements of the result vector operand corresponding to non-selected data elements of the source vector operand to one of: (i) a predetermined value; (ii) a value determined by performing the vector scan operation with non-selected data elements of the source vector operand set to a predetermined value; (iii) a value of a corresponding data element in a source register for storing the source vector operand; and (iv) a value of a corresponding data element in a destination register for storing the result vector operand.
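The handling of non-selected elements in claims 7 to 9 can be sketched as a simple sequential model. This is an illustration under assumptions, not the claimed circuitry: the control information is modelled as a list of booleans, the combination is addition, the predetermined value is 0, and the function name `masked_scan` and the policy names are hypothetical. Option (iv) is left out because it needs the prior contents of a destination register.

```python
def masked_scan(v, s, mask, inactive="zero"):
    """Scan only the selected elements of v, starting from s.

    mask     : per-element flags, truthy = selected (assumed encoding)
    inactive : how result elements for non-selected inputs are filled
               "zero"      -> option (i), predetermined value 0
               "propagate" -> option (ii), scan as if the element were 0
               "source"    -> option (iii), copy the source element
    """
    result = []
    running = s
    for value, selected in zip(v, mask):
        if selected:
            running = running + value
            result.append(running)
        elif inactive == "zero":
            result.append(0)
        elif inactive == "propagate":
            result.append(running)
        elif inactive == "source":
            result.append(value)
        else:
            raise ValueError("unknown policy: " + inactive)
    return result
```

With `v = [1, 2, 3, 4]`, `s = 10` and `mask = [True, False, True, True]`, the three policies give `[11, 0, 14, 18]`, `[11, 11, 14, 18]` and `[11, 2, 14, 18]` respectively.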
10. The data processing apparatus according to claim 1, wherein the vector scan operation is associated with segment information identifying one or more segments of the source vector, each segment comprising one or more data elements; wherein when the segment information identifies a plurality of segments, then the control circuitry is configured to control the processing circuitry to perform said vector scan operation on data elements of a first segment of said plurality of segments.
11. The data processing apparatus according to claim 10, where N has a value such that V[N−1] is the last selected data element of the first segment.
12. The data processing apparatus according to claim 10, wherein when the segment information identifies a plurality of segments, then the control circuitry is configured to control the processing circuitry to generate, for a further segment other than said first segment, at least one result data element within a corresponding further segment of said result vector operand, each result data element within the corresponding further segment having a value corresponding to a combination of one or more data elements of said further segment of the source vector operand.
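The segmented behaviour of claims 10 to 12 can be sketched as follows. This is a sequential illustration under assumptions: the segment information is modelled as a list of flags marking the first element of each segment, the combination is addition, and the function name `segmented_scan` is hypothetical. Only the first segment incorporates S; each further segment combines only its own elements.

```python
def segmented_scan(v, s, seg_start):
    """Scan v segment by segment; S contributes to the first segment only.

    seg_start : flags, truthy where a new segment begins (assumed encoding);
                the flag for element 0 is ignored, since element 0 always
                starts the first segment.
    """
    result = []
    running = s
    for i, value in enumerate(v):
        if i > 0 and seg_start[i]:
            running = value            # further segment: restart without S
        else:
            running = running + value  # continue the current segment
        result.append(running)
    return result
```

For `v = [1, 2, 3, 4, 5, 6]`, `s = 10` and segments starting at elements 0 and 3, the result is `[11, 13, 16, 4, 9, 15]`.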
13. The data processing apparatus according to claim 10, wherein G=M−1, and when the segment information identifies a plurality of segments, then the control circuitry is configured to control the processing circuitry to omit said at least one further step of said vector scan operation.
14. The data processing apparatus according to claim 1, wherein the source vector operand comprises X data elements, where X≥M, and the result vector operand comprises Y data elements, where Y≥M.
15. The data processing apparatus according to claim 14, wherein the control circuitry is configured to control the processing circuitry to perform said vector scan operation on L data elements of the source vector operand to generate L data elements of the result vector operand, where L≤X, and L≤Y; and when L>M, then the L data elements of the result vector operand comprise L−M further data elements R[M] to R[L−1], wherein for M≤i<L result data element R[i] has a value corresponding to a combination of result data element R[M−1] and at least some of data elements V[M] to V[i] of the source vector operand.
16. The data processing apparatus according to claim 15, wherein when L>M, then N=M, and when L≤M, then N=L.
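The case L>M in claims 15 and 16 can be illustrated with a group-at-a-time sketch in which the last result of one group of M elements serves as the additional element for the next group. This is one possible reading under assumptions, not the claimed implementation: `itertools.accumulate` stands in for the M-wide scan, the combination is addition by default, and the function name `scan_long` is hypothetical.

```python
from itertools import accumulate
from operator import add


def scan_long(v, s, m, op=add):
    """Scan L = len(v) elements in groups of at most m elements.

    The first group is combined with s; each later group is combined with
    the carry from the previous group (R[M-1] for the second group).
    """
    result = []
    carry = s
    for start in range(0, len(v), m):
        group = v[start:start + m]
        # accumulate(..., initial=carry) yields carry first, so drop it.
        partial = list(accumulate(group, op, initial=carry))[1:]
        result.extend(partial)
        carry = partial[-1]
    return result
```

For example, `scan_long(list(range(1, 11)), 100, 4)` returns `[101, 103, 106, 110, 115, 121, 128, 136, 145, 155]`.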
17. The data processing apparatus according to claim 14, wherein in each of the plurality of steps, the processing circuitry is configured to generate a maximum of M data elements of the second vector in parallel; and when L>M, then the control circuitry is configured to control the processing circuitry to perform partitioned scan operations separately for a plurality of groups of data elements of the source vector operand, each group comprising M data elements or fewer.
18. The data processing apparatus according to claim 17, wherein when L>M, then the control circuitry is configured to control the processing circuitry to perform said partitioned scan operation for a first group of said plurality of groups with at least two of said plurality of steps comprising said combination operation for combining a data element of the first vector with the at least one additional data element S, and said plurality of steps comprising said at least one combination step and said at least one further step.
19. The data processing apparatus according to claim 18, wherein when L>M, then for at least one further group other than said first group, the control circuitry is configured to control the processing circuitry to perform said partitioned scan operation comprising: at least one preliminary step comprising at least one combination operation for combining respective data elements of the further group of data elements; and at least one additional step comprising at least one combination operation for combining a data element generated in said at least one preliminary step for said further group with a data element generated in said partitioned scan operation for said first group of data elements.
20. The data processing apparatus according to claim 19, wherein the control circuitry is configured to control the processing circuitry to perform said at least one preliminary step for said further group of data elements interleaved with said partitioned scan operation for said first group of data elements.
21. The data processing apparatus according to claim 1, wherein the at least one additional data element S comprises one of: a scalar operand; a data element of a vector operand; a value determined using a plurality of data elements of a vector operand; and a carry value determined by another vector operation.
22. A non-transitory, computer-readable storage medium storing a computer program which, when executed by a computer, controls the computer to provide a virtual execution environment according to the apparatus of claim 1.
23. A data processing apparatus comprising: a vector register storage means for storing vector operands comprising a plurality of data elements; processing means for processing vector operands from the vector register storage means; and control means for controlling the processing means to perform a vector scan operation on M data elements of a source vector operand V[0] to V[M−1] and at least one additional data element S, to generate M data elements of a result vector operand R[0] to R[M−1], where M is a power of 2, each data element of the source vector operand and the result vector operand comprises a plurality of bits, and for N≤M and 0≤i<N data element R[i] of the result vector operand has a value corresponding to a combination of the at least one additional data element S and at least some of data elements V[0] to V[i] of the source vector operand; wherein the control means is configured to control the processing means to perform the vector scan operation in a plurality of steps, each step for generating a second vector from a first vector, where the first vector for a first step comprises data elements of the source vector operand, and the first vector for other steps comprises the second vector of the preceding step, each step comprising at least one combination operation for combining a data element of the first vector with the at least one additional data element S or another data element of the first vector to generate a data element of the second vector; at least one of said plurality of steps comprises a plurality of combination operations performed in parallel; and at least two of said plurality of steps comprise a combination operation for combining a data element of the first vector with the at least one additional data element S; wherein at least in response to a vector scan operation for which N>G, where M/2≤G<M: the control means is configured to control the processing means to perform said plurality of steps comprising at least one combination step for generating G data elements R[0] to R[G−1] of the result vector operand, and at least one further step for generating data elements R[G] to R[M−1] of said result vector operand when performed in addition to the at least one combination step; said at least one combination step is such that, when said at least one combination step was performed by said processing means without performing the at least one further step, then the processing circuitry would generate data elements [G] to [M−1] of the second vector in a final step of said at least one combination step, wherein for G≤k<M data element [k] of the second vector has a value corresponding to a combination of at least some of data elements V[0] to V[k] of the source vector operand, and in said at least one further step, the control means is configured to control the processing means, for G≤k<N, to perform a combination operation for combining the at least one additional data element S with data element [k] of the first vector for the at least one further step.
24. A data processing method for performing a vector scan operation on M data elements of a source vector operand V[0] to V[M−1] and at least one additional data element S, to generate M data elements of a result vector operand R[0] to R[M−1], where M is a power of 2, each data element of the source vector operand and the result vector operand comprises a plurality of bits, and for N≤M and 0≤i<N data element R[i] of the result vector operand has a value corresponding to a combination of the at least one additional data element S and at least some of data elements V[0] to V[i] of the source vector operand; the method performed using processing circuitry and comprising: performing a plurality of steps for generating a second vector from a first vector, where the first vector for a first step comprises data elements of the source vector operand, and the first vector for other steps comprises the second vector of the preceding step, each step comprising at least one combination operation for combining a data element of the first vector with the at least one additional data element S or another data element of the first vector to generate a data element of the second vector; wherein at least one of said plurality of steps comprises a plurality of combination operations performed in parallel; and at least two of said plurality of steps comprise a combination operation for combining a data element of the first vector with the at least one additional data element S; wherein at least in response to a vector scan operation for which N>G, where M/2≤G<M: said plurality of steps comprise at least one combination step for generating G data elements R[0] to R[G−1] of the result vector operand, and at least one further step for generating data elements R[G] to R[M−1] of said result vector operand when performed in addition to the at least one combination step; said at least one combination step is such that, when said at least one combination step was performed without performing the at least one further step, then the processing circuitry would generate data elements [G] to [M−1] of the second vector in a final step of said at least one combination step, wherein for G≤k<M data element [k] of the second vector has a value corresponding to a combination of at least some of data elements V[0] to V[k] of the source vector operand, and in said at least one further step, the processing circuitry, for G≤k<N, performs a combination operation for combining the at least one additional data element S with data element [k] of the first vector for the at least one further step.
June 19, 2018