org.apache.spark.aliyun.odps

OdpsOps

class OdpsOps extends Logging with Serializable

Linear Supertypes
Serializable, Serializable, Logging, AnyRef, Any

Instance Constructors

  1. new OdpsOps(sc: SparkContext, accessKeyId: String, accessKeySecret: String, odpsUrl: String, tunnelUrl: String)
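
    For orientation, a minimal construction sketch. The credential strings are placeholders, and the endpoint URLs are the commonly documented public ODPS endpoints; substitute the values for your own account and region:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.aliyun.odps.OdpsOps

    val conf = new SparkConf().setAppName("odps-demo")
    val sc = new SparkContext(conf)
    // accessKeyId / accessKeySecret identify the Aliyun account;
    // odpsUrl and tunnelUrl point at the ODPS service and data-tunnel endpoints.
    val odpsOps = new OdpsOps(sc, "<accessKeyId>", "<accessKeySecret>",
      "http://service.odps.aliyun.com/api", "http://dt.odps.aliyun.com")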

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. val account: AliyunAccount

  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. val dateFormat: SimpleDateFormat

  10. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  12. def fakeClassTag[T]: ClassTag[T]

  13. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  18. def loadOdpsTable(sqlContext: SQLContext, project: String, table: String, cols: Array[Int], numPartition: Int): DataFrame

    Load an ODPS table into an org.apache.spark.sql.DataFrame.

    val sqlContext = ...
    val odpsOps = ...
    val odpstableDF = odpsOps.loadOdpsTable(sqlContext, "odps-project", "odps-table", Array(0, 2, 3), 2)

    sqlContext

    A Spark SQL context.

    project

    The name of the ODPS project.

    table

    The name of the table the job is reading.

    cols

    The indices of the columns to load, e.g. Array(0, 1, 3).

    numPartition

    The number of RDD partitions, which determines the concurrency of the read.

    returns

    A DataFrame containing the selected records of the ODPS table.
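
    A fuller sketch of the elided setup above, assuming a plain SQLContext built over the same SparkContext used for the OdpsOps instance (names are illustrative):

    import org.apache.spark.sql.{DataFrame, SQLContext}

    val sqlContext = new SQLContext(sc)
    // Load columns 0, 2 and 3 of the table, reading with 2 concurrent partitions.
    val odpstableDF: DataFrame = odpsOps.loadOdpsTable(sqlContext, "odps-project", "odps-table", Array(0, 2, 3), 2)
    odpstableDF.printSchema()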

  19. def loadOdpsTable(sqlContext: SQLContext, project: String, table: String, partition: String, cols: Array[Int], numPartition: Int): DataFrame

    Load an ODPS table into an org.apache.spark.sql.DataFrame.

    val sqlContext = ...
    val odpsOps = ...
    val odpstableDF = odpsOps.loadOdpsTable(sqlContext, "odps-project", "odps-table", "odps-partition", Array(0, 2, 3), 2)

    sqlContext

    A Spark SQL context.

    project

    The name of the ODPS project.

    table

    The name of the table the job is reading.

    partition

    The partition spec, used when reading a partitioned table, like pt='xxx'.

    cols

    The indices of the columns to load, e.g. Array(0, 1, 3).

    numPartition

    The number of RDD partitions, which determines the concurrency of the read.

    returns

    A DataFrame containing the selected records of the ODPS table.

  20. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  21. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  22. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  23. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  24. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  25. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  26. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  27. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  28. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  29. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  30. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  31. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  32. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  33. final def notify(): Unit

    Definition Classes
    AnyRef
  34. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  35. val odps: Odps

  36. val odpsUtils: OdpsUtils

  37. def readTable[T](project: String, table: String, transfer: (Record, TableSchema) ⇒ T, numPartition: Int)(implicit arg0: ClassTag[T]): RDD[T]

    Read a table from ODPS.

    val odpsOps = ...
    val odpsTable = odpsOps.readTable("odps-project", "odps-table", readFunc, 2)
    
    def readFunc(record: Record, schema: TableSchema): Array[Long] = {
      val ret = new Array[Long](schema.getColumns.size())
      for (i <- 0 until schema.getColumns.size()) {
        ret(i) = record.getString(i).toLong
      }
      ret
    }

    project

    The name of the ODPS project.

    table

    The name of the table the job is reading.

    transfer

    A function that converts each com.aliyun.odps.data.Record of the table into an element of the resulting org.apache.spark.rdd.RDD; it is applied to every record.

    numPartition

    The number of RDD partitions, which determines the concurrency of the read.

    returns

    An RDD containing all records of the ODPS table.

    Annotations
    @unchecked()
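
    A usage sketch tying the example together, assuming the odpsOps instance from the constructor example and the readFunc defined above (table names are illustrative):

    import org.apache.spark.rdd.RDD

    val odpsTable: RDD[Array[Long]] = odpsOps.readTable("odps-project", "odps-table", readFunc, 2)
    // Each element is one table row, converted to Array[Long] by readFunc.
    println(odpsTable.count())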
  38. def readTable[T](project: String, table: String, partition: String, transfer: (Record, TableSchema) ⇒ T, numPartition: Int)(implicit arg0: ClassTag[T]): RDD[T]

    Read a table from ODPS.

    val odpsOps = ...
    val odpsTable = odpsOps.readTable("odps-project", "odps-table", "odps-partition", readFunc, 2)
    
    def readFunc(record: Record, schema: TableSchema): Array[Long] = {
      val ret = new Array[Long](schema.getColumns.size())
      for (i <- 0 until schema.getColumns.size()) {
        ret(i) = record.getString(i).toLong
      }
      ret
    }

    project

    The name of the ODPS project.

    table

    The name of the table the job is reading.

    partition

    The partition spec, used when reading a partitioned table, like pt='xxx'.

    transfer

    A function that converts each com.aliyun.odps.data.Record of the table into an element of the resulting org.apache.spark.rdd.RDD; it is applied to every record.

    numPartition

    The number of RDD partitions, which determines the concurrency of the read.

    returns

    An RDD containing all records of the ODPS table.

    Annotations
    @unchecked()
  39. def readTableWithJava[R](project: String, table: String, transfer: Function2[Record, TableSchema, R], numPartition: Int): JavaRDD[R]

    Read a table from ODPS.

    OdpsOps odpsOps = ...
    static class RecordToLongs implements Function2<Record, TableSchema, List<Long>> {
      @Override
      public List<Long> call(Record record, TableSchema schema) throws Exception {
          List<Long> ret = new ArrayList<Long>();
          for (int i = 0; i < schema.getColumns().size(); i++) {
              ret.add(Long.valueOf(record.getString(i)));
          }
          return ret;
      }
    }

    JavaRDD<List<Long>> readData = odpsOps.readTableWithJava("odps-project", "odps-table", new RecordToLongs(), 2);

    project

    The name of the ODPS project.

    table

    The name of the table the job is reading.

    transfer

    A function that converts each com.aliyun.odps.data.Record of the table into an element of the resulting org.apache.spark.api.java.JavaRDD; it is applied to every record.

    numPartition

    The number of RDD partitions, which determines the concurrency of the read.

    returns

    A JavaRDD containing all records of the ODPS table.

  40. def readTableWithJava[R](project: String, table: String, partition: String, transfer: Function2[Record, TableSchema, R], numPartition: Int): JavaRDD[R]

    Read a table from ODPS.

    OdpsOps odpsOps = ...
    static class RecordToLongs implements Function2<Record, TableSchema, List<Long>> {
      @Override
      public List<Long> call(Record record, TableSchema schema) throws Exception {
          List<Long> ret = new ArrayList<Long>();
          for (int i = 0; i < schema.getColumns().size(); i++) {
              ret.add(Long.valueOf(record.getString(i)));
          }
          return ret;
      }
    }

    JavaRDD<List<Long>> readData = odpsOps.readTableWithJava("odps-project", "odps-table", "odps-partition", new RecordToLongs(), 2);

    project

    The name of the ODPS project.

    table

    The name of the table the job is reading.

    partition

    The partition spec, used when reading a partitioned table, like pt='xxx'.

    transfer

    A function that converts each com.aliyun.odps.data.Record of the table into an element of the resulting org.apache.spark.api.java.JavaRDD; it is applied to every record.

    numPartition

    The number of RDD partitions, which determines the concurrency of the read.

    returns

    A JavaRDD containing all records of the ODPS table.

  41. def saveToTable[T](project: String, table: String, rdd: RDD[T], transfer: (T, Record, TableSchema) ⇒ Unit)(implicit arg0: ClassTag[T]): Unit

    Save an RDD to an ODPS table.

    val odpsOps = ...
    val data: RDD[Array[Long]] = ...
    odpsOps.saveToTable("odps-project", "odps-table", data, writeFunc)
    
    def writeFunc(kv: Array[Long], record: Record, schema: TableSchema) {
      for (i <- 0 until schema.getColumns.size()) {
        record.setString(i, kv(i).toString)
      }
    }

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    rdd

    An org.apache.spark.rdd.RDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the RDD; it is applied to every element.

    Annotations
    @unchecked()
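
    A usage sketch for the write path, assuming the odpsOps instance from the constructor example, the writeFunc defined above, and an existing target table whose columns match the data (names are illustrative):

    import org.apache.spark.rdd.RDD

    val data: RDD[Array[Long]] = sc.parallelize(Seq(Array(1L, 2L), Array(3L, 4L)))
    // writeFunc copies each Array[Long] element into the corresponding ODPS Record.
    odpsOps.saveToTable("odps-project", "odps-table", data, writeFunc)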
  42. def saveToTable[T](project: String, table: String, partition: String, rdd: RDD[T], transfer: (T, Record, TableSchema) ⇒ Unit, defaultCreate: Boolean, overwrite: Boolean)(implicit arg0: ClassTag[T]): Unit

    Save an RDD to an ODPS table.

    val odpsOps = ...
    val data: RDD[Array[Long]] = ...
    odpsOps.saveToTable("odps-project", "odps-table", "odps-partition", data, writeFunc, false, false)
    
    def writeFunc(kv: Array[Long], record: Record, schema: TableSchema) {
      for (i <- 0 until schema.getColumns.size()) {
        record.setString(i, kv(i).toString)
      }
    }

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    partition

    The partition spec, used when writing a partitioned table, like pt='xxx'.

    rdd

    An org.apache.spark.rdd.RDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the RDD; it is applied to every element.

    defaultCreate

    Whether to create the table partition if it does not exist.

    overwrite

    Whether to overwrite the partition if it exists. NOTE: only a partition can be overwritten, not a whole table.

    Annotations
    @unchecked()
  43. def saveToTable[T](project: String, table: String, partition: String, rdd: RDD[T], transfer: (T, Record, TableSchema) ⇒ Unit, defaultCreate: Boolean)(implicit arg0: ClassTag[T]): Unit

    Save an RDD to an ODPS table.

    val odpsOps = ...
    val data: RDD[Array[Long]] = ...
    odpsOps.saveToTable("odps-project", "odps-table", "odps-partition", data, writeFunc, false)
    
    def writeFunc(kv: Array[Long], record: Record, schema: TableSchema) {
      for (i <- 0 until schema.getColumns.size()) {
        record.setString(i, kv(i).toString)
      }
    }

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    partition

    The partition spec, used when writing a partitioned table, like pt='xxx'.

    rdd

    An org.apache.spark.rdd.RDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the RDD; it is applied to every element.

    defaultCreate

    Whether to create the table partition if it does not exist.

  44. def saveToTable[T](project: String, table: String, partition: String, rdd: RDD[T], transfer: (T, Record, TableSchema) ⇒ Unit)(implicit arg0: ClassTag[T]): Unit

    Save an RDD to an ODPS table.

    val odpsOps = ...
    val data: RDD[Array[Long]] = ...
    odpsOps.saveToTable("odps-project", "odps-table", "odps-partition", data, writeFunc)
    
    def writeFunc(kv: Array[Long], record: Record, schema: TableSchema) {
      for (i <- 0 until schema.getColumns.size()) {
        record.setString(i, kv(i).toString)
      }
    }

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    partition

    The partition spec, used when writing a partitioned table, like pt='xxx'.

    rdd

    An org.apache.spark.rdd.RDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the RDD; it is applied to every element.

  45. def saveToTableWithJava[T](project: String, table: String, javaRdd: JavaRDD[T], transfer: Function3[T, Record, TableSchema, Unit]): Unit

    Save a JavaRDD to an ODPS table.

    OdpsOps odpsOps = ...
    JavaRDD<List<Long>> data = ...
    static class SaveRecord implements Function3<List<Long>, Record, TableSchema, BoxedUnit> {
      @Override
      public BoxedUnit call(List<Long> data, Record record, TableSchema schema) throws Exception {
          for (int i = 0; i < schema.getColumns().size(); i++) {
              record.setString(i, data.get(i).toString());
          }
          return null;
      }
    }
    
    odpsOps.saveToTableWithJava("odps-project", "odps-table", data, new SaveRecord());

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    javaRdd

    An org.apache.spark.api.java.JavaRDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the JavaRDD; it is applied to every element.

  46. def saveToTableWithJava[T](project: String, table: String, partition: String, javaRdd: JavaRDD[T], transfer: Function3[T, Record, TableSchema, Unit], defaultCreate: Boolean, overwrite: Boolean): Unit

    Save a JavaRDD to an ODPS table.

    OdpsOps odpsOps = ...
    JavaRDD<List<Long>> data = ...
    static class SaveRecord implements Function3<List<Long>, Record, TableSchema, BoxedUnit> {
      @Override
      public BoxedUnit call(List<Long> data, Record record, TableSchema schema) throws Exception {
          for (int i = 0; i < schema.getColumns().size(); i++) {
              record.setString(i, data.get(i).toString());
          }
          return null;
      }
    }
    
    odpsOps.saveToTableWithJava("odps-project", "odps-table", "odps-partition", data, new SaveRecord(), false, false);

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    partition

    The partition spec, used when writing a partitioned table, like pt='xxx'.

    javaRdd

    An org.apache.spark.api.java.JavaRDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the JavaRDD; it is applied to every element.

    defaultCreate

    Whether to create the table partition if it does not exist.

    overwrite

    Whether to overwrite the partition if it exists. NOTE: only a partition can be overwritten, not a whole table.

  47. def saveToTableWithJava[T](project: String, table: String, partition: String, javaRdd: JavaRDD[T], transfer: Function3[T, Record, TableSchema, Unit], defaultCreate: Boolean): Unit

    Save a JavaRDD to an ODPS table.

    OdpsOps odpsOps = ...
    JavaRDD<List<Long>> data = ...
    static class SaveRecord implements Function3<List<Long>, Record, TableSchema, BoxedUnit> {
      @Override
      public BoxedUnit call(List<Long> data, Record record, TableSchema schema) throws Exception {
          for (int i = 0; i < schema.getColumns().size(); i++) {
              record.setString(i, data.get(i).toString());
          }
          return null;
      }
    }
    
    odpsOps.saveToTableWithJava("odps-project", "odps-table", "odps-partition", data, new SaveRecord(), false);

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    partition

    The partition spec, used when writing a partitioned table, like pt='xxx'.

    javaRdd

    An org.apache.spark.api.java.JavaRDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the JavaRDD; it is applied to every element.

    defaultCreate

    Whether to create the table partition if it does not exist.

  48. def saveToTableWithJava[T](project: String, table: String, partition: String, javaRdd: JavaRDD[T], transfer: Function3[T, Record, TableSchema, Unit]): Unit

    Save a JavaRDD to an ODPS table.

    OdpsOps odpsOps = ...
    JavaRDD<List<Long>> data = ...
    static class SaveRecord implements Function3<List<Long>, Record, TableSchema, BoxedUnit> {
      @Override
      public BoxedUnit call(List<Long> data, Record record, TableSchema schema) throws Exception {
          for (int i = 0; i < schema.getColumns().size(); i++) {
              record.setString(i, data.get(i).toString());
          }
          return null;
      }
    }
    
    odpsOps.saveToTableWithJava("odps-project", "odps-table", "odps-partition", data, new SaveRecord());

    project

    The name of the ODPS project.

    table

    The name of the table the job is writing to.

    partition

    The partition spec, used when writing a partitioned table, like pt='xxx'.

    javaRdd

    An org.apache.spark.api.java.JavaRDD that will be written into the ODPS table.

    transfer

    A function that fills an ODPS com.aliyun.odps.data.Record from an element of the JavaRDD; it is applied to every element.

  49. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  50. def toString(): String

    Definition Classes
    AnyRef → Any
  51. val tunnel: DataTunnel

  52. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  53. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  54. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
