Flink Series: Connectors, Reading and Writing txt Files

Original
02/04 14:48

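This example uses the Flink DataSet connectors to open a txt file, reading its lines into a DataSet (input) and writing a DataSet back to a text file (output).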

Example environment

java.version: 1.8.x
flink.version: 1.11.1

Example data source (project download on Gitee)

Flink Series: Setting Up the Development Environment and Data

Example module (pom.xml)

Flink Series: DataStream Connectors and Example Modules

Data input

TextSource.java

package com.flink.examples.file;

import org.apache.commons.lang3.StringUtils;
import org.apache.commons.lang3.time.DateUtils;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.util.Collector;
import org.apache.flink.api.java.tuple.Tuple7;

import java.util.Date;

/**
 * @Description Reads the contents of a txt file into a DataSet
 */
public class TextSource {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // Path to the sample data file; adjust it to your local checkout
        String filePath = "D:\\Workspaces\\idea_2\\flink-examples\\connectors\\src\\main\\resources\\user.txt";
        DataSet<Tuple7<Integer, String, Integer, Integer, String, Date, Long>> dataSet = env.readTextFile(filePath)
                // Parse each comma-separated line into a 7-field tuple
                .flatMap(new FlatMapFunction<String, Tuple7<Integer, String, Integer, Integer, String, Date, Long>>() {
                    @Override
                    public void flatMap(String value, Collector<Tuple7<Integer, String, Integer, Integer, String, Date, Long>> out) throws Exception {
                        // Skip blank lines
                        if (StringUtils.isNotBlank(value)) {
                            String[] values = value.split(",");
                            // Use Flink's Java Tuple7 rather than scala.Tuple7, so the Java API handles it as a native tuple type
                            out.collect(Tuple7.of(
                                    Integer.parseInt(values[0]),
                                    values[1],
                                    Integer.parseInt(values[2]),
                                    Integer.parseInt(values[3]),
                                    values[4],
                                    DateUtils.parseDate(values[5], "yyyy-MM-dd HH:mm:ss"),
                                    Long.parseLong(values[6])));
                        }
                    }
                });
        dataSet.print();
    }

}
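
The per-line parsing inside the flatMap above can also be tried out without a Flink runtime. The following is a minimal sketch in plain Java, assuming the seven-field, comma-separated layout implied by TextSource. The class UserLineParser, the holder UserRecord and its field names f0..f6 are hypothetical and not part of the original project; unlike the original, it uses java.time instead of DateUtils, trims each field, and rejects lines that do not have exactly seven fields.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of the original project.
class UserLineParser {

    // Same date pattern as the DateUtils.parseDate call in TextSource.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Holder for the seven parsed fields; names mirror Flink's f0..f6 tuple fields. */
    static final class UserRecord {
        final int f0;
        final String f1;
        final int f2;
        final int f3;
        final String f4;
        final LocalDateTime f5;
        final long f6;

        UserRecord(int f0, String f1, int f2, int f3, String f4, LocalDateTime f5, long f6) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
            this.f3 = f3;
            this.f4 = f4;
            this.f5 = f5;
            this.f6 = f6;
        }

        @Override
        public String toString() {
            return "(" + f0 + "," + f1 + "," + f2 + "," + f3 + "," + f4 + "," + f5 + "," + f6 + ")";
        }
    }

    /** Returns an empty Optional for blank lines and throws on malformed lines. */
    static Optional<UserRecord> parse(String line) {
        if (line == null || line.trim().isEmpty()) {
            return Optional.empty();
        }
        // limit -1 keeps trailing empty fields, so a missing last value is detected below
        String[] v = line.split(",", -1);
        if (v.length != 7) {
            throw new IllegalArgumentException(
                    "expected 7 fields but got " + v.length + ": " + line);
        }
        return Optional.of(new UserRecord(
                Integer.parseInt(v[0].trim()),
                v[1].trim(),
                Integer.parseInt(v[2].trim()),
                Integer.parseInt(v[3].trim()),
                v[4].trim(),
                LocalDateTime.parse(v[5].trim(), FORMAT),
                Long.parseLong(v[6].trim())));
    }

    public static void main(String[] args) {
        // Sample line built from the values used in the TextSink example
        String line = "15,chen1,40,1,CN,2020-09-08 00:00:00,1599494400000";
        parse(line).ifPresent(System.out::println);
    }
}
```

Failing fast on a wrong field count is a design choice of this sketch: the original flatMap would instead raise an ArrayIndexOutOfBoundsException on a short line, which is harder to trace back to the offending input.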

Data output

TextSink.java

package com.flink.examples.file;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.types.Row;

/**
 * @Description Writes the data of a DataSet to a txt file
 */
public class TextSink {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
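        // The file needs to be created first
        String filePath = "D:\\Workspaces\\idea_2\\flink-examples\\connectors\\src\\main\\resources\\user.txt";
        // Build a Row from the field values; Row.of(...) sizes the Row to the number of arguments
        // (the alternative is new Row(fieldCount) followed by setField calls)
        Row row = Row.of(15, "chen1", 40, 1, "CN", "2020-09-08 00:00:00", 1599494400000L);
        // Convert to a DataSet
        DataSet<Row> dataSet = env.fromElements(row);
        // Write the contents to the file; if the file already exists, it is overwritten
        dataSet.writeAsText(filePath, FileSystem.WriteMode.OVERWRITE).setParallelism(1);
        env.execute("flink file sink");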
    }

}
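
writeAsText writes each element using its toString(), so the layout of the output line depends on how Row renders itself, which may differ between Flink versions. When the file has to be read back by splitting on commas, it can help to make the line format explicit. The following is a minimal sketch in plain Java; the class UserLineFormatter and its format method are hypothetical names introduced here, and the comma-joined layout is an assumption taken from the split(",") call in TextSource.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of the original project.
class UserLineFormatter {

    private static final String DELIMITER = ",";

    /** Joins the field values into one comma-separated line. */
    static String format(Object... fields) {
        StringJoiner joiner = new StringJoiner(DELIMITER);
        for (Object field : fields) {
            String text = String.valueOf(field);
            // A delimiter inside a value would corrupt the line for a split-based reader
            if (text.contains(DELIMITER)) {
                throw new IllegalArgumentException("field contains the delimiter: " + text);
            }
            joiner.add(text);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Same field values as the Row built in TextSink
        System.out.println(
                format(15, "chen1", 40, 1, "CN", "2020-09-08 00:00:00", 1599494400000L));
        // prints: 15,chen1,40,1,CN,2020-09-08 00:00:00,1599494400000
    }
}
```

In a Flink job, a function like this could presumably be supplied through DataSet.writeAsFormattedText with a TextFormatter in place of writeAsText; check the API documentation of your Flink version before relying on that.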

Data display
